Most effective way to find spatial relationships in python


I am trying to find the spatial relationship between two polygon geoDFs as follows:

  1. Find geoDF1.geom within geoDF2.geom, else (not within)

  2. Find geoDF1.geom with the largest intersection from geoDF2.geom, else (no intersection)

  3. Find geoDF1.geom with the smallest distance from geoDF2.geom

  4. Update geoDF1 with a selected column from geoDF2


I wrote a function which seems to work fine, but I am not sure this is the most effective way to do it. I am especially worried about step #3, which right now checks the distance to every row of geoDF2 and can become memory intensive if it is a large table. Perhaps I should create a buffer that limits the distance checked?



import geopandas as gpd
import pandas as pd

### FUNCTION that checks spatial relationship between two geodataframes - df1, df2:
# Order of operations: within -> largest intersection -> smallest distance
# Checks relationship of df1 geometry (left) to df2 geometry
# Updates df1 with new col from df2: df2_col
# df1_id = unique id column of df1

def relate(df1, df1_id, df2, df2_col):

    # join df2_col for all df1 geometries within a df2 geometry
    # (geopandas >= 0.10 calls this argument `predicate`; older releases used `op`)
    join_df = gpd.sjoin(df1, df2, how='left', predicate="within")
    # keep one match per df1 row and sort by df1_id
    join_df = join_df.drop_duplicates(df1_id).sort_values(df1_id)

    # get rows that are not within
    remaining_df = join_df.loc[join_df[df2_col].isna()]

    ## check relations of df1 to df2: df1_id -> df2_col of the largest intersection
    intersects = {}
    # check for each geom1 not within geom2
    for df1_geom, row_id in zip(remaining_df.geometry, remaining_df[df1_id]):
        size = 0
        sel_type = None
        for df2_geom, df2_type in zip(df2.geometry, df2[df2_col]):
            try:
                new_size = df1_geom.intersection(df2_geom).area
            except Exception:
                print("couldn't intersect for df1 with id: " + str(row_id))
                continue
            # if larger than the previous intersection, remember this df2_col value
            if new_size > size:
                size = new_size
                sel_type = df2_type
        # if an intersection exists, record it
        if size > 0:
            intersects[row_id] = sel_type

    # set type for intersecting geoms (rows already matched keep their value)
    join_df[df2_col] = join_df[df2_col].fillna(join_df[df1_id].map(intersects))

    # get df1 geoms that are neither within nor intersecting any df2 geom
    remaining_df2 = join_df.loc[join_df[df2_col].isna()]

    ##!!## JOIN based on distance: df1_id -> df2_col of the closest df2 geom
    distance = {}
    # check for each df1 with no df2_col
    for geom, row_id in zip(remaining_df2.geometry, remaining_df2[df1_id]):
        size = None
        sel_type = None
        for df2_geom, df2_type in zip(df2.geometry, df2[df2_col]):
            try:
                insize = geom.distance(df2_geom)
            except Exception:
                print("couldn't calculate distance for df1 with id: " + str(row_id))
                continue
            # if smaller than the previous distance (or the first one), remember this df2_col value
            if size is None or insize < size:
                size = insize
                sel_type = df2_type
        if size is not None:
            distance[row_id] = sel_type

    # set df2_col type for remaining df1 geoms by minimal distance
    join_df[df2_col] = join_df[df2_col].fillna(join_df[df1_id].map(distance))

    ## UPDATE original df1 with the column joined from df2
    lookup = join_df.set_index(df1_id)[df2_col]
    df1[df2_col] = df1[df1_id].map(lookup)
    return df1
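
The order of the checks (within, then largest intersection, then smallest distance) can be sketched without any GIS library by standing in axis-aligned boxes for the polygons. This is only an illustration of the cascade, not the function above: `relate_boxes`, the `(minx, miny, maxx, maxy)` tuples and the sample data are all made up for the example, and real polygons would need proper geometry operations.

```python
from math import hypot

# Boxes are (minx, miny, maxx, maxy): a hypothetical stand-in for polygons.

def within(a, b):
    """True if box a lies entirely inside box b."""
    return b[0] <= a[0] and b[1] <= a[1] and a[2] <= b[2] and a[3] <= b[3]

def overlap_area(a, b):
    """Area shared by boxes a and b (0.0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def distance(a, b):
    """Shortest distance between boxes a and b (0.0 if they touch or overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)

def relate_boxes(boxes1, boxes2):
    """Map each id in boxes1 to (label from boxes2, rule that matched).

    Assumes boxes2 is not empty.
    """
    result = {}
    for id1, a in boxes1.items():
        # 1. within
        inside = [label for label, b in boxes2.items() if within(a, b)]
        if inside:
            result[id1] = (inside[0], "within")
            continue
        # 2. largest intersection
        label, area = max(
            ((lab, overlap_area(a, b)) for lab, b in boxes2.items()),
            key=lambda pair: pair[1],
        )
        if area > 0:
            result[id1] = (label, "intersection")
            continue
        # 3. smallest distance
        label, _ = min(
            ((lab, distance(a, b)) for lab, b in boxes2.items()),
            key=lambda pair: pair[1],
        )
        result[id1] = (label, "distance")
    return result

zones = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10)}
parcels = {1: (1, 1, 2, 2), 2: (8, 1, 14, 2), 3: (25, 0, 26, 1)}
print(relate_boxes(parcels, zones))
```

On real GeoDataFrames, recent geopandas releases provide `sjoin_nearest`, which accepts a `max_distance` argument; that may be worth a look for limiting the search radius in step 3 instead of building a buffer by hand.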








python spatial-join geopandas

asked 49 secs ago
Eliav S.T.





















