Using ST_DWithin to see duplicates
I tried it this way, but I think using ST_DWithin might be better for this situation. What would be the correct order of operations to find duplicates that lie within 10 km of each other?
    SELECT n, ST_ClusterDBSCAN(geog::geometry, eps := .08, minpoints := 2) OVER () AS cid
    FROM (
        SELECT *
        FROM cities AS t
        INNER JOIN (
            SELECT n dn
            FROM cities AS t
            GROUP BY n
            HAVING count(*) >= 2
        ) dups ON dups.dn = t.n
        ORDER BY t.n
    ) d
EDIT: I guess I would have to do something like this:
    SELECT *
    FROM (
        -- even-numbered rows of the name-ordered duplicates
        SELECT *
        FROM (
            SELECT *, ROW_NUMBER() OVER (ORDER BY t.n) AS rownum
            FROM cities AS t
            INNER JOIN (
                SELECT n dn
                FROM cities AS t
                GROUP BY n
                HAVING count(*) = 2
            ) dups ON dups.dn = t.n
            ORDER BY t.n
        ) d
        WHERE mod(rownum, 2) = 0
    ) even,
    (
        -- odd-numbered rows of the name-ordered duplicates
        SELECT *
        FROM (
            SELECT *, ROW_NUMBER() OVER (ORDER BY t.n) AS rownum
            FROM cities AS t
            INNER JOIN (
                SELECT n dn
                FROM cities AS t
                GROUP BY n
                HAVING count(*) = 2
            ) dups ON dups.dn = t.n
            ORDER BY t.n
        ) d
        WHERE mod(rownum, 2) = 1
    ) odd
    WHERE ST_DWithin(even.geog, odd.geog, 5000)
But this is confusing. Maybe it's better to just do the ST_DWithin check first, but I'm not sure how to do that.
postgis
Please add more context to your question. You have tried ST_ClusterDBSCAN to find duplicates, correct? Show us the ST_DWithin select statement you would use. If you find a duplicate, which one do you want to keep and which one should be deleted, or do you want to delete both of them?
– Michael
27 mins ago
I'd like to keep the one with the fewest null values, and I'd like to copy the i column's value to the other row before deleting.
– jaksco
6 mins ago
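The merge-then-delete step described in the comment above could be sketched roughly as follows. This is only an illustration under several assumptions that are not stated in the question: the table has a unique key column called id, each name occurs at most twice (as in the count(*) = 2 query), col_a and col_b are hypothetical stand-ins for the other nullable columns, and "copy the i column's value" means taking i from the row that is about to be deleted whenever that row has one.

```sql
BEGIN;

-- Pair up same-named rows within 10 km (geography distances are in meters)
-- and decide which row of each pair to keep: the one with fewer NULLs.
-- id, col_a and col_b are hypothetical column names.
CREATE TEMP TABLE dup_pairs ON COMMIT DROP AS
SELECT
    CASE WHEN num_nulls(a.i, a.col_a, a.col_b) <= num_nulls(b.i, b.col_a, b.col_b)
         THEN a.id ELSE b.id END AS keep_id,
    CASE WHEN num_nulls(a.i, a.col_a, a.col_b) <= num_nulls(b.i, b.col_a, b.col_b)
         THEN b.id ELSE a.id END AS drop_id
FROM cities AS a
JOIN cities AS b
  ON a.n = b.n
 AND a.id < b.id                      -- list each pair only once
 AND ST_DWithin(a.geog, b.geog, 10000);

-- Copy i from the row being dropped onto the row being kept.
-- COALESCE keeps the existing value when the dropped row has no i.
UPDATE cities AS k
SET i = COALESCE(d.i, k.i)
FROM dup_pairs AS p
JOIN cities AS d ON d.id = p.drop_id
WHERE k.id = p.keep_id;

-- Remove the row with more NULLs from each pair.
DELETE FROM cities AS c
USING dup_pairs AS p
WHERE c.id = p.drop_id;

COMMIT;
```

Running the SELECT on its own first, before the UPDATE and DELETE, is a cheap way to check that the pairs look right. If a name can occur three or more times, a row may appear as both keep_id and drop_id, so this sketch would need a different grouping step.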
asked 1 hour ago – jaksco
edited 11 mins ago
1 Answer
If your goal is just to identify duplicate records in your data, you can use the ST_DWithin function like this:
    SELECT c1.col1
    FROM cities AS c1
    INNER JOIN cities AS c2
        ON ST_DWithin(c1.geom, c2.geom, 10000)
    WHERE c1.gid != c2.gid
      AND c1.col1 = c2.col1
I assumed your data is in a projected coordinate system (units: meters) and has a unique gid column. Duplication is based on col1, which may be a name or any other value that should be unique within a 10 km radius.
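For the table in the question, which stores a geography column geog and a name column n, a minimal sketch might look like the following. ST_DWithin on geography takes its distance in meters, so no projected coordinate system is needed here. The unique key column id is an assumption, since the question does not show one.

```sql
-- Assumed schema: cities(id, n, geog); id is a hypothetical unique key.
SELECT a.id AS id_a,
       b.id AS id_b,
       a.n,
       ST_Distance(a.geog, b.geog) AS meters_apart
FROM cities AS a
JOIN cities AS b
  ON a.n = b.n                           -- same name
 AND a.id < b.id                         -- report each pair once, never a row with itself
 AND ST_DWithin(a.geog, b.geog, 10000)   -- within 10 km
ORDER BY a.n, meters_apart;
```

Comparing with a.id < b.id instead of a.id != b.id halves the output, because each pair shows up once rather than in both directions. A spatial index on geog, for example a GiST index, should let ST_DWithin avoid comparing every row against every other row.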
answered 4 mins ago – Shahzad Bacha