Commit
feat(readwriter): Implement File Format Reader/Writer (#72)
* feat: Implement CSV Options Configuration for DataFrameReader (#53)
  - Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
  - Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `csv` method that accepts CsvOptions.
* feat: Implement CSV Options Configuration for DataFrameReader (#54)
  - Added documentation for the CsvOptions struct.
* test(readwriter): Implement test_dataframe_read_csv_with_options (#54)
* refactor: Improve CSV method to handle multiple paths (#54)
  - Updated the csv method in DataFrameReader to support both single string slices and arrays of string slices as input paths.
* feat: Added implementations for JSON Options struct (#54)
* feat: Implement JSON Options Configuration for DataFrameReader (#54)
  - Added JsonOptions struct to support JSON read options like `schema`, `multi_line`, `encoding`, and more.
  - Implemented ConfigOpts trait for JsonOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `json` method that accepts JsonOptions.
  - Documented all available JSON options, including example usage for setting options when reading JSON files.
  [TO DO]
  - Write tests to validate JSON options functionality.
* feat: Implement ORC Options Configuration for DataFrameReader (#54)
  - Example usage provided for setting ORC options when reading files.
  - Write tests to validate ORC options functionality.
* feat: Implement Parquet Options Configuration for DataFrameReader (#54)
  - Added ParquetOptions struct to support Parquet read options like `mergeSchema`, `pathGlobFilter`, and `recursiveFileLookup`.
  - Implemented ConfigOpts trait for ParquetOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `parquet` method that accepts ParquetOptions.
  - Example usage provided for setting Parquet options when reading files.
  - Write tests to validate Parquet options functionality.
* feat: Implement Text Options Configuration for DataFrameReader (#54)
  - Added TextOptions struct to support text read options like `wholetext`, `lineSep`, and `pathGlobFilter`.
  - Implemented ConfigOpts trait for TextOptions to convert options into key-value pairs.
  - Updated DataFrameReader to include `text` method that accepts TextOptions.
  - Example usage provided for setting text options when reading files.
  - Write tests to validate text options functionality.
* feat: Implement Text and Parquet Options Configuration for DataFrameWriter (#54)
  - Added TextOptions struct to support text write options such as `whole_text` and `line_sep`.
  - Added ParquetOptions struct to support Parquet write options like `merge_schema`, `path_glob_filter`, and `datetime_rebase_mode`.
  - Implemented `write` method in DataFrameWriter to handle configuration for text and Parquet file formats.
  - Example usage provided for setting text and Parquet options when writing DataFrames.
  - Write tests to validate text and Parquet file writing functionality.
* Added rustdocs to method implementations.
* feat: Implement initial methods for file format reader and writer (#54)
  - Added support for reading and writing .csv, .json, .orc, .parquet, and .text file formats.
  - Created `ConfigOpts` trait for each file type to manage options in a structured way.
  - Added example method signatures for file reading using a configurable options object passed into methods.
* Add missing csv options to CsvOptions.
* feat: Implement Configuration Options for DataFrameReader and Writer (#54)
  - Implemented additional fields in ParquetOptions compression.
  - Updated test_dataframe_read_parquet_with_options to ensure valid compression codec usage.
  - Enhanced test_dataframe_read_text_with_options to properly read lines by setting line_sep and disabling whole_text.
  - Implemented the #[derive(Debug, Clone)] traits for all Option structs.
  - Updated expected path_glob_filter type to string.
  - Added the compression field to ParquetOptions, OrcOptions, and JsonOptions.
  - Updated documentation for all Options structs to include descriptions for new and existing fields.
* feat: Refactor file format options with shared CommonFileOptions (#54)
  - Introduced CommonFileOptions to handle common configuration fields such as:
    - path_glob_filter
    - recursive_file_lookup
    - ignore_corrupt_files
    - ignore_missing_files
    - modified_before
    - modified_after
  - Updated CsvOptions, JsonOptions, OrcOptions, ParquetOptions, and TextOptions to use CommonFileOptions for the shared fields.
  - Updated the new() constructors for each file format options struct to initialize CommonFileOptions.
  - Refactored tests for each file format (e.g., ORC, CSV) to utilize the new CommonFileOptions, ensuring that both format-specific and shared options are properly tested.
  - Updated and verified tests for DataFrame reading and writing operations with updated options.
* Updated rustdocs.
* Updated typo in rustdocs: /// - - Common file options...
* Updated README - DataFrameReader/Writer section.

---------

Co-authored-by: lexara-prime-ai <irfanghta@gmail.com>
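
The pattern this message describes (typed option structs that implement `ConfigOpts` to produce key-value pairs, with shared fields factored into `CommonFileOptions`) can be sketched roughly as below. This is a minimal, self-contained illustration and not the crate's actual code. The method name `to_options`, the builder-style setters, the `common` field name, and the helper `example_csv_options` are assumptions made for the example. Only the struct and trait names and the option keys (`header`, `delimiter`, `nullValue`, `pathGlobFilter`, `recursiveFileLookup`) come from the message itself.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `ConfigOpts` trait named in the commit:
/// converts a typed options struct into string key-value pairs.
pub trait ConfigOpts {
    fn to_options(&self) -> HashMap<String, String>;
}

/// Shared fields (a subset of those listed in the commit message).
#[derive(Debug, Clone, Default)]
pub struct CommonFileOptions {
    pub path_glob_filter: Option<String>,
    pub recursive_file_lookup: Option<bool>,
}

impl ConfigOpts for CommonFileOptions {
    fn to_options(&self) -> HashMap<String, String> {
        let mut opts = HashMap::new();
        // Only options that were explicitly set are emitted.
        if let Some(v) = &self.path_glob_filter {
            opts.insert("pathGlobFilter".to_string(), v.clone());
        }
        if let Some(v) = self.recursive_file_lookup {
            opts.insert("recursiveFileLookup".to_string(), v.to_string());
        }
        opts
    }
}

/// Format-specific options plus the shared block (field layout is assumed).
#[derive(Debug, Clone, Default)]
pub struct CsvOptions {
    pub header: Option<bool>,
    pub delimiter: Option<char>,
    pub null_value: Option<String>,
    pub common: CommonFileOptions,
}

impl CsvOptions {
    pub fn new() -> Self {
        Self::default()
    }

    // Builder-style setters; an assumed convenience, not confirmed by the source.
    pub fn header(mut self, value: bool) -> Self {
        self.header = Some(value);
        self
    }

    pub fn delimiter(mut self, value: char) -> Self {
        self.delimiter = Some(value);
        self
    }

    pub fn null_value(mut self, value: &str) -> Self {
        self.null_value = Some(value.to_string());
        self
    }
}

impl ConfigOpts for CsvOptions {
    fn to_options(&self) -> HashMap<String, String> {
        // Start from the shared options, then add the CSV-specific ones.
        let mut opts = self.common.to_options();
        if let Some(v) = self.header {
            opts.insert("header".to_string(), v.to_string());
        }
        if let Some(v) = self.delimiter {
            opts.insert("delimiter".to_string(), v.to_string());
        }
        if let Some(v) = &self.null_value {
            opts.insert("nullValue".to_string(), v.clone());
        }
        opts
    }
}

/// Hypothetical usage: build CSV options and flatten them to key-value pairs.
pub fn example_csv_options() -> HashMap<String, String> {
    let mut csv = CsvOptions::new()
        .header(true)
        .delimiter(';')
        .null_value("NULL");
    csv.common.recursive_file_lookup = Some(true);
    csv.to_options()
}
```

In this sketch, a reader method such as `csv` would accept anything implementing `ConfigOpts` and forward the resulting map as read options. Keeping the shared fields in one struct means each format only has to merge its own keys on top.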