R - multicore approach to extract raster values using spatial points


It might look like this question is a duplicate, but I'm asking about data extraction by points, rather than by polygons, and I couldn't find any hints about point extraction. Therefore, please bear with me.



I am working with a massive amount of climate files whose values need to be extracted and converted to data frames in order to serve as a database for a Shiny application.



Specifically, I need to extract daily temperature values from a raster stack for 655 locations (points) and store those values in a data frame. Given the amount of data, I estimate the final data frame to be around 30 GB in size.



Since this is such a large amount of data, I am looking for a multicore (parallel) approach that takes advantage of a quad-core (8-thread) system. I am also considering processing these data on an Amazon EC2 instance, so I need to speed up the code as much as I can to avoid wasting time (i.e. money).



I know that the raster package provides the multicore function beginCluster, which might help me, but according to the documentation it only works when extracting data using polygons, not points.



Please take a look at the code that reproduces my procedure using a simplified version of my raster stack:



library(raster)

# Create date sequence
idx <- seq(as.Date("2010/1/1"), as.Date("2099/12/31"), by = "day")

# Create raster stack and assign dates
# WARNING: raster stack will be ~ 400 MB in size
r <- raster(ncol=5, nrow=5)
s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
s <- setZ(s, idx)

# Create random spatial points
pts <- SpatialPoints(cbind(x=runif(655, -180, 180),
                           y=runif(655, -90, 90)),
                     proj4string=CRS(projection(s)))

# Extract values to a data frame
e.df <- extract(s, pts, df=TRUE)

# Fix resulting data frame
DF <- data.frame(t(e.df[-1])); row.names(DF)=NULL
DF <- cbind(getZ(s), DF)
names(DF)[1] <- "Date";
names(DF)[2:length(names(DF))] <- paste0('Point ', 1:length(pts))


In the code above, clearly the slowest part of the process is the data extraction itself.



Is there any way to improve the speed of this data extraction procedure?



Is there any way to implement it with a multicore approach?










  • I don't see where your 30Gb comes from. 89 years x 365 days x 655 stations is 21 million. You need to be handling about 1000 bytes for each of those units to get 30Gb, where one temperature measurement is 8 bytes. 160-250Mb maybe, which is easily stored in RAM. – Spacedman, Aug 29 '17 at 10:10
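Spacedman's back-of-envelope figure can be checked directly; using the question's actual date range (2010-01-01 to 2099-12-31, 90 years including leap days) and 8-byte doubles:

```r
# Number of daily values per point over the question's date range
days <- as.integer(as.Date("2099-12-31") - as.Date("2010-01-01")) + 1  # 32872

# Total values for 655 points, stored as 8-byte doubles
n_values <- days * 655
size_mb  <- n_values * 8 / 1024^2

n_values  # ~21.5 million values
size_mb   # ~164 MB -- nowhere near 30 GB
```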








  • This is "trivially parallel" over layers in your raster stack and over points in your spatial points. Use functions from the parallel package. – Spacedman, Aug 29 '17 at 10:15











  • @Spacedman my actual raster stack is 100x95 at 0.5 deg resolution with 54750 "layers" (i.e. time slices). Also, I have 32 files like that that I need to extract data from (different climate scenarios and models). I just thought it was too big of an object for users to reproduce here. – thiagoveloso, Aug 29 '17 at 10:46













  • @Spacedman I would also appreciate any more detailed suggestion regarding the parallel package. Not very familiar with multicore processing here... – thiagoveloso, Aug 29 '17 at 10:48






  • Suggest you find some R parallel tutorials, things like this cran.r-project.org/web/packages/doParallel/vignettes/… might get you started. – Spacedman, Aug 29 '17 at 10:55
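Following the suggestion to use the base parallel package, a minimal sketch of the per-layer split (the stack and points below are shrunk versions of the question's setup; the idea is one extract call per RasterLayer, farmed out over a PSOCK cluster):

```r
library(raster)
library(parallel)

# Shrunk version of the question's stack and points
r <- raster(ncol = 5, nrow = 5)
s <- stack(lapply(1:10, function(x) setValues(r, runif(ncell(r)))))
pts <- SpatialPoints(cbind(x = runif(20, -180, 180),
                           y = runif(20, -90, 90)),
                     proj4string = CRS(projection(s)))

# One worker per core (minus one), each loading 'raster' itself
cl <- makeCluster(detectCores() - 1)
clusterEvalQ(cl, library(raster))

# Farm out one extraction per layer; result is a points x layers matrix
vals <- parSapply(cl, unstack(s), raster::extract, y = pts)
stopCluster(cl)

dim(vals)  # 20 points x 10 layers
```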

















Tags: raster, r, point, extract, parallel-processing






edited Aug 29 '17 at 21:02 – PolyGeo
asked Aug 29 '17 at 8:56 – thiagoveloso








1 Answer
I ended up using an approach based on the snowfall package. It is quite simple, works really well, and the point extraction scales with the number of cores you can use. The approach I used was inspired by this post, and here is my reproducible example:



library(raster)
library(snowfall)

# Create date sequence
idx <- seq(as.Date("2010/1/1"), as.Date("2099/12/31"), by = "day")

# Create raster stack and assign dates
# WARNING: raster stack will be ~ 400 MB in size
r <- raster(ncol=5, nrow=5)
s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
s <- setZ(s, idx)

# Create random spatial points
pts <- SpatialPoints(cbind(x=runif(655, -180, 180),
                           y=runif(655, -90, 90)),
                     proj4string=CRS(projection(s)))

# Extract values to a data frame - multicore approach
# First, convert raster stack to list of single raster layers
s.list <- unstack(s)
names(s.list) <- names(s)

# Now, create a R cluster using all the machine cores minus one
sfInit(parallel=TRUE, cpus=parallel::detectCores()-1)

# Load the required packages inside the cluster
sfLibrary(raster)
sfLibrary(sp)

# Run parallelized 'extract' function and stop cluster
e.df <- sfSapply(s.list, extract, y=pts)
sfStop()

# Fix resulting data frame
DF <- data.frame(t(e.df)); row.names(DF)=NULL
DF <- cbind(getZ(s), DF)
names(DF)[1] <- "Date";
names(DF)[2:length(names(DF))] <- paste0('Point ', 1:length(pts))

# Check resulting data frame
head(DF)


I hope it can serve as a future reference for people interested in doing the same.
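On Linux or macOS, the same per-layer split also works with fork-based parallel::mclapply, which avoids the cluster setup and the sfLibrary() calls entirely (a sketch over a shrunk version of the same toy stack; on Windows mc.cores must be 1, so the snowfall approach above remains the portable choice there):

```r
library(raster)
library(parallel)

# Same toy stack and points as in the question, shrunk for brevity
r <- raster(ncol = 5, nrow = 5)
s <- stack(lapply(1:10, function(x) setValues(r, runif(ncell(r)))))
pts <- SpatialPoints(cbind(x = runif(20, -180, 180),
                           y = runif(20, -90, 90)),
                     proj4string = CRS(projection(s)))

# Forked workers share memory with the parent, so no package loading
# or data export is needed on the workers (Unix-only for mc.cores > 1)
vals <- mclapply(unstack(s), raster::extract, y = pts,
                 mc.cores = max(1L, detectCores() - 1L))

# Bind the per-layer vectors into the same points x layers shape
e.df <- do.call(cbind, vals)
```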






edited 11 mins ago
answered Sep 6 '17 at 20:59 – thiagoveloso





























