Upload data

The calls below upload data directly into a Dataset. For tasks like interactive chart generation, we recommend File uploads instead, because you can quickly build different datasets from previously uploaded Files.
Each entry below lists its route, method, headers, parameters, and return value, followed by an example.

Route:
api/v2/upload/datasets/<dataset_id>/nodes/json
api/v2/upload/datasets/<dataset_id>/edges/json
Method: POST
Headers:
Content-Type: application/json
Authorization: Bearer YOUR_JWT_TOKEN
Query (url) parameters (see cudf.io.json.read_json()):
{
  ?compression: str,
  ?dtype: bool | dict,
  ?lines: bool,
  ?orient: 'split' | 'records' | 'index' | 'columns' | 'values' | 'table'
}
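
A minimal sketch of building the upload URL with query parameters, using only the Python standard library. It assumes, following the curl example in this section, that dict-valued parameters such as dtype are passed as JSON text in the query string; treating lists, booleans, and numbers the same way is an assumption not confirmed here. `build_upload_url` and its arguments are hypothetical names used for illustration.

```python
import json
from urllib.parse import quote, urlencode


def build_upload_url(base_url, dataset_id, kind, fmt, **params):
    """Build an upload URL; `kind` is 'nodes' or 'edges', `fmt` is e.g. 'json'."""
    if kind not in ("nodes", "edges"):
        raise ValueError("kind must be 'nodes' or 'edges'")
    encoded = {}
    for name, value in params.items():
        # Assumption: non-string values travel as compact JSON text,
        # mirroring dtype={"prop1":"int32"} in the curl example.
        if isinstance(value, str):
            encoded[name] = value
        else:
            encoded[name] = json.dumps(value, separators=(",", ":"))
    path = f"/api/v2/upload/datasets/{quote(dataset_id, safe='')}/{kind}/{fmt}"
    url = base_url.rstrip("/") + path
    # urlencode percent-escapes braces, quotes, and colons in the values.
    return url + "?" + urlencode(encoded) if encoded else url


url = build_upload_url(
    "http://localhost", "my_generated_dataset_id", "edges", "json",
    dtype={"prop1": "int32"}, orient="records",
)
print(url)
```

Percent-encoding the values this way avoids the shell and curl escaping that a hand-written URL containing braces and quotes needs.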
        
Body version 1 (orient=records): row-based - list of row objects
[
  {<column_name>: 'a, ... },
  ...
]
      
Body version 2 (inferred): columnar - record of column arrays
{
  <column_name>: [ ... ],
...
}
      
Body version 3 (orient=records, lines=True): JSON lines - one object per line
{<column_name>: 'a, ... }
...
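
The three body versions carry the same table in different layouts. The sketch below, using only the standard library, converts a list of row objects (version 1) into the columnar layout (version 2) and the one-object-per-line layout (version 3). The helper names are hypothetical, and the sketch assumes every row has the same keys.

```python
import json

rows = [
    {"s": "a", "d": "b", "prop1": 2},
    {"s": "b", "d": "c", "prop1": 4},
    {"s": "c", "d": "a", "prop1": 6},
]


def to_records_body(rows):
    """Version 1: a JSON list of row objects (send with orient=records)."""
    return json.dumps(rows)


def to_columnar_body(rows):
    """Version 2: a JSON object mapping each column name to its values."""
    if not rows:
        return "{}"
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    return json.dumps(columns)


def to_json_lines_body(rows):
    """Version 3: one JSON object per line (send with orient=records, lines=True)."""
    return "\n".join(json.dumps(row) for row in rows)


records_body = to_records_body(rows)
columnar_body = to_columnar_body(rows)
lines_body = to_json_lines_body(rows)
print(columnar_body)
```

The columnar result for these rows matches the edges_columnar.json input used in the example for this route.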
      
Return:
{
  "data": {
    "dataset_id": str,
    "dtypes": {<str>: str},
    "num_cols": int,
    "num_rows": int,
    "time_parsing_s": int
  },
  "message": str,
  "success": bool
}                    
        
Input:
edges_columnar.json:
{
  "s": ["a", "b", "c"],
  "d": ["b", "c", "a"],
  "prop1": [2, 4, 6]
}
        
Upload:

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my_generated_token" \
  -T edges_columnar.json \
  "http://localhost/api/v2/upload/datasets/my_generated_dataset_id/edges/json?dtype=\{\"prop1\":\"int32\"\}"
        
Output:

{
  "data":{
    "dataset_id": "e176bfb2813947e198ccf371c44a5972",
    "dtypes": {"d": "object", "prop1": "int32", "s": "object"},
    "num_cols": 3,
    "num_rows": 3,
    "time_parsing_s": 0
  },
  "message": "Dataset edges created",
  "success":true
}
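
For callers scripting this upload from Python rather than curl, a rough equivalent using only the standard library might look like the following. It builds the request without sending it. The host, token, and dataset id are the placeholders from the curl example, `make_json_upload_request` and `check_upload_response` are hypothetical helper names, and `sample_reply` is a trimmed copy of the output shown above. How the server reports failures (HTTP status, message text) is not described in this section, so the error handling is an assumption.

```python
import json
import urllib.request
from urllib.parse import urlencode


def make_json_upload_request(base_url, dataset_id, token, table, dtype=None):
    """Build (but do not send) a POST request for the edges/json route."""
    url = f"{base_url}/api/v2/upload/datasets/{dataset_id}/edges/json"
    if dtype is not None:
        # dtype is sent as JSON text in the query string, as in the curl example.
        url += "?" + urlencode({"dtype": json.dumps(dtype, separators=(",", ":"))})
    return urllib.request.Request(
        url,
        data=json.dumps(table).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def check_upload_response(text):
    """Return the "data" object of a reply, or raise if "success" is false."""
    reply = json.loads(text)
    if not reply.get("success"):
        raise RuntimeError(reply.get("message", "upload failed"))
    return reply["data"]


table = {"s": ["a", "b", "c"], "d": ["b", "c", "a"], "prop1": [2, 4, 6]}
request = make_json_upload_request(
    "http://localhost", "my_generated_dataset_id", "my_generated_token",
    table, dtype={"prop1": "int32"},
)
# To send against a running server:
#   text = urllib.request.urlopen(request).read().decode("utf-8")
sample_reply = (
    '{"data": {"num_cols": 3, "num_rows": 3},'
    ' "message": "Dataset edges created", "success": true}'
)
print(check_upload_response(sample_reply)["num_rows"])
```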

Route:
api/v2/upload/datasets/<dataset_id>/nodes/csv
api/v2/upload/datasets/<dataset_id>/edges/csv
Method: POST
Headers:
Authorization: Bearer YOUR_JWT_TOKEN
Query (url) parameters (see cudf.io.csv.read_csv() and pandas.read_csv()):
{
  ?sep: str,
  ?delim_whitespace: bool,
  ?lineterminator: str,
  ?skipinitialspace: bool,
  ?names: arr,
  ?dtype: list | dict,
  ?quotechar: str,
  ?quoting: int,
  ?doublequote: bool,
  ?encoding: str,
  ?header: int | 'infer',
  ?usecols: list<int> | list<str>,
  ?mangle_dupe_cols: bool,
  ?skiprows: int,
  ?skipfooter: int,
  ?compression: 'infer' | 'gzip' | 'zip',
  ?decimal: str,
  ?thousands: str,
  ?true_values: list,
  ?false_values: list,
  ?nrows: int,
  ?byte_range: [int, int],
  ?skip_blank_lines: bool,
  ?parse_dates: list<int> | list<str>,
  ?comment: str,
  ?na_values: list,
  ?keep_default_na: bool,
  ?na_filter: bool,
  ?prefix: str
}
        
Body: row-based
header1,header2,...
val1,val2,...
...
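
As an illustration of the row-based body, the sketch below writes a header row and data rows with Python's csv module. `to_csv_body` is a hypothetical helper. The sketch assumes the server's defaults match the pandas.read_csv defaults implied by the parameter list (comma separator, double-quote quoting); otherwise pass sep, quotechar, and related query parameters explicitly.

```python
import csv
import io


def to_csv_body(header, rows):
    """Serialize a header and rows into CSV text suitable as a request body."""
    buffer = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_body = to_csv_body(
    ["s", "d", "prop1"],
    [("a", "b", 2), ("b", "c", 4), ("c", "a", 6)],
)
print(csv_body)
```

The csv writer quotes any value that contains the separator or a quote character, which keeps such values in a single column when parsed.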
        
Return:
{
  "data": {
    "dataset_id": str,
    "dtypes": {<str>: str},
    "num_cols": int,
    "num_rows": int,
    "time_parsing_s": int
  },
  "message": str,
  "success": bool
}                    
        
Input:
edges.csv:
s,d,prop1
a,b,2
b,c,4
c,a,6
        
Upload:

curl -X POST \
  -H "Authorization: Bearer my_generated_token" \
  -T edges.csv \
  "http://localhost/api/v2/upload/datasets/my_generated_dataset_id/edges/csv?dtype=\{\"prop1\":\"int32\"\}"
Output:

{
  "data":{
    "dataset_id": "e176bfb2813947e198ccf371c44a5972",
    "dtypes": {"d": "object", "prop1": "int32", "s": "object"},
    "num_cols": 3,
    "num_rows": 3,
    "time_parsing_s": 0
  },
  "message": "Dataset edges created",
  "success":true
}

Route:
api/v2/upload/datasets/<dataset_id>/nodes/parquet
api/v2/upload/datasets/<dataset_id>/edges/parquet
Method: POST
Headers:
Authorization: Bearer YOUR_JWT_TOKEN
Query (url) parameters (see cudf.io.parquet.read_parquet()):
{
  ?columns: list,
  ?row_group: int,
  ?skip_rows: int,
  ?num_rows: int
}
        
Body: the table (see CSV format) encoded as Parquet
Return:
{
  "data": {
    "dataset_id": str,
    "dtypes": {<str>: str},
    "num_cols": int,
    "num_rows": int,
    "time_parsing_s": int
  },
  "message": str,
  "success": bool
}                    
        
Input:
edges.parquet: see CSV example
Upload:

curl -X POST \
  -H "Authorization: Bearer my_generated_token" \
  -T edges.parquet \
  "http://localhost/api/v2/upload/datasets/my_generated_dataset_id/edges/parquet"
Output:

{
  "data":{
    "dataset_id": "e176bfb2813947e198ccf371c44a5972",
    "dtypes": {"d": "object", "prop1": "int32", "s": "object"},
    "num_cols": 3,
    "num_rows": 3,
    "time_parsing_s": 0
  },
  "message": "Dataset edges created",
  "success":true
}

Route:
api/v2/upload/datasets/<dataset_id>/nodes/orc
api/v2/upload/datasets/<dataset_id>/edges/orc
Method: POST
Headers:
Authorization: Bearer YOUR_JWT_TOKEN
Query (url) parameters (see cudf.io.orc.read_orc()):
{
  ?columns: list,
  ?stripe: int,
  ?skip_rows: int,
  ?num_rows: int
}
        
Body: the table (see CSV format) encoded as ORC
Return:
{
  "data": {
    "dataset_id": str,
    "dtypes": {<str>: str},
    "num_cols": int,
    "num_rows": int,
    "time_parsing_s": int
  },
  "message": str,
  "success": bool
}                    
        
Input:
edges.orc: see CSV example
Upload:

curl -X POST \
  -H "Authorization: Bearer my_generated_token" \
  -T edges.orc \
  "http://localhost/api/v2/upload/datasets/my_generated_dataset_id/edges/orc"
Output:

{
  "data":{
    "dataset_id": "e176bfb2813947e198ccf371c44a5972",
    "dtypes": {"d": "object", "prop1": "int32", "s": "object"},
    "num_cols": 3,
    "num_rows": 3,
    "time_parsing_s": 0
  },
  "message": "Dataset edges created",
  "success":true
}

Route:
api/v2/upload/datasets/<dataset_id>/nodes/arrow
api/v2/upload/datasets/<dataset_id>/edges/arrow
Method: POST
Headers:
Authorization: Bearer YOUR_JWT_TOKEN
Query (url) parameters: none
Body: the table (see CSV format) encoded as Arrow
Return:
{
  "data": {
    "dataset_id": str,
    "dtypes": {<str>: str},
    "num_cols": int,
    "num_rows": int,
    "time_parsing_s": int
  },
  "message": str,
  "success": bool
}                    
        
Input:
edges.arrow: see CSV example
Upload:

curl -X POST \
  -H "Authorization: Bearer my_generated_token" \
  -T edges.arrow \
  "http://localhost/api/v2/upload/datasets/my_generated_dataset_id/edges/arrow"
Output:

{
  "data":{
    "dataset_id": "e176bfb2813947e198ccf371c44a5972",
    "dtypes": {"d": "object", "prop1": "int32", "s": "object"},
    "num_cols": 3,
    "num_rows": 3,
    "time_parsing_s": 0
  },
  "message": "Dataset edges created",
  "success":true
}
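
The routes in this section share one pattern and differ only in the nodes/edges segment and the final format segment. As a sketch of that pattern, the hypothetical helper below picks the format segment from a file extension and builds a POST request around the raw file bytes, without sending it. The extension-to-format mapping and all names are illustrative assumptions, and only the JSON route gets a Content-Type header because that is the only route whose entry lists one.

```python
import os
import urllib.request

# Final route segment for each file extension used in this section's examples.
ROUTE_FORMATS = {
    ".json": "json",
    ".csv": "csv",
    ".parquet": "parquet",
    ".orc": "orc",
    ".arrow": "arrow",
}


def make_upload_request(base_url, dataset_id, token, kind, filename, body):
    """Build (but do not send) an upload request for `body` (bytes)."""
    if kind not in ("nodes", "edges"):
        raise ValueError("kind must be 'nodes' or 'edges'")
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ROUTE_FORMATS:
        raise ValueError(f"unsupported file type: {filename}")
    headers = {"Authorization": f"Bearer {token}"}
    if ext == ".json":
        headers["Content-Type"] = "application/json"
    url = (
        f"{base_url}/api/v2/upload/datasets/{dataset_id}"
        f"/{kind}/{ROUTE_FORMATS[ext]}"
    )
    return urllib.request.Request(url, data=body, method="POST", headers=headers)


request = make_upload_request(
    "http://localhost", "my_generated_dataset_id", "my_generated_token",
    "edges", "edges.csv", b"s,d,prop1\na,b,2\nb,c,4\nc,a,6\n",
)
print(request.get_method(), request.full_url)
```

In practice the body would come from `open(path, "rb").read()`. Note that when a request with data and no Content-Type header is sent through `urllib.request.urlopen`, urllib adds `application/x-www-form-urlencoded` by default; whether the server cares for the non-JSON routes is not stated here.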