Add datasets and other useful files
31
README.md
Normal file
@@ -0,0 +1,31 @@
# Directory structure of the project
## Virtual environment
The following folders and files belong to the Python venv directory structure: `bin/`, `include/`, `lib/`, `share/`, and `pyvenv.cfg`.

The `requirements.txt` file lists the Python packages required by the project.
They should already be installed, but if you reset the venv, you can reinstall them with `python3 -m pip install -r requirements.txt`.
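To check whether a reinstall is needed after resetting the venv, something like the following sketch could help. It is not part of the project: the function names `requirement_names` and `missing_packages` are hypothetical, and it assumes `requirements.txt` contains only simple lines such as `name==version` (no URLs or editable installs). It only checks that each package is present, not that the installed version matches the pin.

```python
import re
from importlib import metadata
from pathlib import Path
from typing import List


def requirement_names(text: str) -> List[str]:
    """Extract bare package names from the text of a requirements file."""
    names = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # ASSUMPTION: the line starts with a plain package name.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(requirements_file: Path = Path("requirements.txt")) -> List[str]:
    """Return the listed packages that are not installed in the current environment."""
    missing = []
    for name in requirement_names(requirements_file.read_text()):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

If `missing_packages()` returns a non-empty list when run with the venv's interpreter, running `python3 -m pip install -r requirements.txt` should restore the listed packages.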
## Source code
All Python source code is in the `src/` directory.
## Datasets
Datasets are stored in dedicated directories.

Say you have a dataset named `XLII`.

- All files related to the dataset must be inside the `XLII_dataset/` folder
- The `.csv` files containing the original data must be placed inside the `XLII_dataset/csv/` folder
- The SQL code that creates the tables with the correct schema must be in the `XLII_dataset/create_tables.sql` file

You can replace `XLII` with any dataset name you want (I used `flight_delay` and `SSB`).
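The layout rules above can be checked with a small script. This is only a sketch: the helper `check_dataset_layout` is hypothetical and does not exist in the project, and it assumes the dataset folders sit directly under the directory passed as `root`.

```python
from pathlib import Path
from typing import List


def check_dataset_layout(name: str, root: Path = Path(".")) -> List[str]:
    """Return a list of layout problems for <name>_dataset/ (empty if none)."""
    dataset_dir = root / f"{name}_dataset"
    if not dataset_dir.is_dir():
        return [f"missing folder: {dataset_dir}"]

    problems = []
    # The original data must be in <name>_dataset/csv/ as .csv files.
    if not any((dataset_dir / "csv").glob("*.csv")):
        problems.append(f"no .csv files in: {dataset_dir / 'csv'}")
    # The schema must be in <name>_dataset/create_tables.sql.
    if not (dataset_dir / "create_tables.sql").is_file():
        problems.append(f"missing file: {dataset_dir / 'create_tables.sql'}")
    return problems
```

For example, `check_dataset_layout("flight_delay")` would return an empty list if `flight_delay_dataset/` follows the three rules.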
Then, if you run `make reset`, an SQLite database file named `XLII_dataset/XLII.db` is created or overwritten. It is initialized with the schema given in `XLII_dataset/create_tables.sql` and populated with the data from the `XLII_dataset/csv/*.csv` files.
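The Makefile itself is not shown here, so the following is only a rough Python equivalent of what `make reset` is described as doing, not the project's actual implementation. The function `reset_database` is hypothetical. It assumes that each CSV file is named after the table it fills (for example `csv/flights.csv` for a table `flights`), that files are comma-separated, and that the first row is a header to skip. None of these assumptions is confirmed by the README.

```python
import csv
import sqlite3
from pathlib import Path


def reset_database(name: str, root: Path = Path(".")) -> Path:
    """Recreate <name>_dataset/<name>.db from create_tables.sql and csv/*.csv."""
    dataset_dir = root / f"{name}_dataset"
    db_path = dataset_dir / f"{name}.db"

    # Start from scratch: remove any existing database file.
    db_path.unlink(missing_ok=True)

    conn = sqlite3.connect(db_path)
    try:
        # Create the tables with the schema from create_tables.sql.
        conn.executescript((dataset_dir / "create_tables.sql").read_text())

        for csv_file in sorted((dataset_dir / "csv").glob("*.csv")):
            table = csv_file.stem  # ASSUMPTION: table name == file name
            with csv_file.open(newline="") as handle:
                reader = csv.reader(handle)
                header = next(reader, None)  # ASSUMPTION: first row is a header
                if header is None:
                    continue  # empty file, nothing to load
                placeholders = ", ".join("?" for _ in header)
                # Insert the remaining rows; SQLite applies column type affinity.
                conn.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', reader
                )
        conn.commit()
    finally:
        conn.close()
    return db_path
```

Calling `reset_database("SSB")` from the project root would then rebuild `SSB_dataset/SSB.db` under these assumptions. If the real CSV files have no header row or use another delimiter, the reader setup would need to change accordingly.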