Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Not A Bug
Description
I wrote a simple CSV to Parquet converter at https://github.com/domoritz/csv2parquet/blob/f53feb5bd995eab41dee09f2c4d722512052d7ca/src/main.rs.
Running it (`csv2parquet test.txt test.parquet`) with a simple file such as
```
a,b,c
0,1,hello world
0,1,hello world
0,1,hello world
0,1,hello world
0,1,hello world
0,1,hello world
0,1,hello world
```
and then trying to read the resulting file in Python with
```
import pandas as pd
df = pd.read_parquet('test.parquet')
df.to_csv('test2.csv')
```
results in this error:
```
OSError: Could not open parquet input source '<Buffer>': Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
```
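For context on what the error is checking: an unencrypted Parquet file starts and ends with the 4-byte magic `PAR1`, and the reader looks for it in the last bytes of the file. Below is a minimal sketch for inspecting the output file by hand. The helper name `has_parquet_magic` is hypothetical (it is not part of pandas, pyarrow, or the converter), and it only tests the magic bytes, not whether the rest of the file is valid.

```python
import os

# Magic marker found at both the start and the end of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(path):
    """Return (header_ok, footer_ok) for the file at `path`.

    Hypothetical diagnostic helper: it compares only the first and last
    4 bytes against the magic and does not parse the footer metadata.
    """
    size = os.path.getsize(path)
    if size < len(PARQUET_MAGIC):
        # Too short to contain even one magic marker.
        return (False, False)
    with open(path, "rb") as f:
        header = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        footer = f.read(len(PARQUET_MAGIC))
    return (header == PARQUET_MAGIC, footer == PARQUET_MAGIC)
```

Calling `has_parquet_magic('test.parquet')` on the file produced above would show whether the trailing marker was ever written. A `False` in the second position would be consistent with the "magic bytes not found in footer" message.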
The schema seems to be inferred correctly
```
Inferred Schema:
{
"fields": [
{
"name": "a",
"nullable": false,
"type":
,
"children": []
},
{
"name": "b",
"nullable": false,
"type":
,
"children": []
},
{
"name": "c",
"nullable": false,
"type":
,
"children": []
}
],
"metadata": {}
}
```