Description
Partition input data sources by key values and save the results to a partitioned Xdf on disk.
Usage
rxPartition(inData, outData, varsToPartition, append = "rows", overwrite = FALSE, ...)
Arguments
inData
either a data source object, a character string specifying a .xdf file, or a data frame object.
outData
a partitioned data source object created by RxXdfData with createPartitionSet = TRUE.
varsToPartition
character vector of variable names whose values are used as the partitioning keys.
append
either "none" to create a new file or "rows" to append rows to an existing file. If outData exists and append is "none", the overwrite argument must be set to TRUE.
overwrite
logical value. If TRUE, an existing outData will be overwritten. overwrite is ignored if append = "rows".
...
additional arguments to be passed directly to the Revolution Compute Engine.
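The interaction between append and overwrite described above can be sketched in base R. This is only an illustration of the documented rules, not RevoScaleR code: the helper name resolveWriteMode and its outExists parameter are hypothetical, and the "create" result for a missing output is an assumption.

```r
# Hypothetical helper (not part of RevoScaleR): mirrors the documented
# interaction between the `append` and `overwrite` arguments.
resolveWriteMode <- function(outExists, append = "rows", overwrite = FALSE) {
  append <- match.arg(append, c("rows", "none"))

  if (append == "rows") {
    # `overwrite` is ignored when appending rows.
    # Assumption: a missing output is simply created.
    return(if (outExists) "append" else "create")
  }

  # append == "none": a new partitioned Xdf is requested
  if (!outExists) {
    return("create")
  }
  if (isTRUE(overwrite)) {
    return("overwrite")
  }
  stop("outData exists and append = \"none\": set overwrite = TRUE")
}

resolveWriteMode(outExists = TRUE, append = "rows")
resolveWriteMode(outExists = TRUE, append = "none", overwrite = TRUE)
```

The only combination that fails in this sketch is an existing output with append = "none" and overwrite = FALSE.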
Value
a data frame of partitioning values and data sources. Each row represents one partition, and the data source in the last variable holds the data of that specific partition.
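As a rough analogy for the shape of this return value, the following base R sketch builds a table with one row per distinct key combination and the matching rows in the last column. It uses in-memory data frames where rxPartition stores data sources, and the toy data, the keys vector, and the partTable name are all illustrative assumptions.

```r
# Toy data; the values are made up for illustration only.
claims <- data.frame(
  car.age = c("0-3", "0-3", "4-7", "4-7", "8-9"),
  type    = c("A", "B", "A", "A", "B"),
  cost    = c(289, 282, 133, 160, 372),
  stringsAsFactors = FALSE
)
keys <- c("car.age", "type")

# One row per distinct combination of key values.
partTable <- unique(claims[keys])
rownames(partTable) <- NULL

# Last column: the rows belonging to each partition (a list column).
partTable$data <- lapply(seq_len(nrow(partTable)), function(i) {
  keep <- claims$car.age == partTable$car.age[i] &
    claims$type == partTable$type[i]
  claims[keep, , drop = FALSE]
})

# Key values of every partition, then the data of the first partition.
partTable[keys]
partTable$data[[1]]
```

Every input row lands in exactly one partition, so the partition sizes add up to the number of input rows.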
Note
In the current version, this function is single-threaded.
Author(s)
Microsoft Corporation Microsoft Technical Support
See Also
Examples
##############################################################################
# Construct a partitioned Xdf
##############################################################################
# create an input Xdf data source
inFile <- "claims.xdf"
inFile <- file.path(rxGetOption("sampleDataDir"), inFile)
inXdfDS <- RxXdfData(file = inFile)
# create an output partitioned Xdf data source
outFile <- file.path(tempdir(), "partitionedClaims.xdf")
outPartXdfDataSource <- RxXdfData(file = outFile, createPartitionSet = TRUE)
# construct and save the partitioned Xdf to disk
partDF <- rxPartition(inData = inXdfDS, outData = outPartXdfDataSource, varsToPartition = c("car.age"))
##############################################################################
# Append new data to an existing partitioned Xdf
##############################################################################
# create two sets of data frames from Xdf data source
inFile <- "claims.xdf"
inFile <- file.path(rxGetOption("sampleDataDir"), inFile)
inXdfDS <- RxXdfData(file = inFile)
inDF <- rxImport(inData = inXdfDS)
df1 <- inDF[1:50,]
df2 <- inDF[51:nrow(inDF),]
# create an output partitioned Xdf data source
outFile <- file.path(tempdir(), "partitionedClaims.xdf")
outPartXdfDataSource <- RxXdfData(file = outFile, createPartitionSet = TRUE)
# construct the partitioned Xdf from the first data set df1 and save to disk
partDF1 <- rxPartition(inData = df1, outData = outPartXdfDataSource, varsToPartition = c("car.age", "type"), append = "none", overwrite = TRUE)
# append data from the second data set to the existing partitioned Xdf
partDF2 <- rxPartition(inData = df2, outData = outPartXdfDataSource, varsToPartition = c("car.age", "type"))
# overwrite an existing partitioned Xdf
partDF3 <- rxPartition(inData = inXdfDS, outData = outPartXdfDataSource, varsToPartition = c("car.age"), append = "none", overwrite = TRUE)