Implement File Format Reader/Writer #72

Merged: 19 commits, Oct 11, 2024

Commits on Aug 27, 2024

  1. feat: Implement CSV Options Configuration for DataFrameReader (sjrusso8#53)
    
    - Added CsvOptions struct to support CSV read options like `header`, `delimiter`, and `nullValue`.
    - Implemented ConfigOpts trait for CsvOptions to convert options into key-value pairs.
    - Updated DataFrameReader to include `csv` method that accepts CsvOptions.
    lexara-prime-ai committed Aug 27, 2024
    Commit 990299a
  2. feat: Implement CSV Options Configuration for DataFrameReader (sjrusso8#54)
    
    - Added documentation for the CsvOptions struct.
    lexara-prime-ai committed Aug 27, 2024
    Commit 738e35a
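
The CSV commits above describe a `CsvOptions` struct and a `ConfigOpts` trait that converts options into key-value pairs. A minimal sketch of how that pattern could look follows. The trait method name `to_options`, the builder methods, and the field types are assumptions for illustration and are not taken from the PR's code.

```rust
use std::collections::HashMap;

/// Hypothetical trait shape: flatten typed options into string key-value pairs.
pub trait ConfigOpts {
    fn to_options(&self) -> HashMap<String, String>;
}

/// Illustrative subset of CSV read options (field names and types are assumptions).
#[derive(Debug, Clone, Default)]
pub struct CsvOptions {
    pub header: Option<bool>,
    pub delimiter: Option<char>,
    pub null_value: Option<String>,
}

impl CsvOptions {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn header(mut self, value: bool) -> Self {
        self.header = Some(value);
        self
    }

    pub fn delimiter(mut self, value: char) -> Self {
        self.delimiter = Some(value);
        self
    }

    pub fn null_value(mut self, value: &str) -> Self {
        self.null_value = Some(value.to_string());
        self
    }
}

impl ConfigOpts for CsvOptions {
    /// Only options that were explicitly set are emitted.
    fn to_options(&self) -> HashMap<String, String> {
        let mut options = HashMap::new();
        if let Some(header) = self.header {
            options.insert("header".to_string(), header.to_string());
        }
        if let Some(delimiter) = self.delimiter {
            options.insert("delimiter".to_string(), delimiter.to_string());
        }
        if let Some(null_value) = &self.null_value {
            options.insert("nullValue".to_string(), null_value.clone());
        }
        options
    }
}
```

Leaving unset options out of the map lets the server-side defaults apply, which is one plausible reason to model each field as an `Option`.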

Commits on Aug 28, 2024

  1. Commit 94d54d3

Commits on Sep 3, 2024

  1. refactor: Improve CSV method to handle multiple paths (sjrusso8#54)

    - Updated the `csv` method in DataFrameReader to accept either a single string slice or an array of string slices as input paths.
    lexara-prime-ai committed Sep 3, 2024
    Commit 565cace
  2. Commit 0bb27c2
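
The Sep 3 refactor lets the `csv` method take one path or several. One way to get that in Rust is a small conversion trait implemented for each accepted input type. The sketch below is illustrative only: the trait `IntoPaths` and the function `csv_paths` are hypothetical names, and the PR may solve this differently.

```rust
/// Hypothetical conversion trait: anything that can become a list of paths.
pub trait IntoPaths {
    fn into_paths(self) -> Vec<String>;
}

/// A single string slice becomes a one-element list.
impl IntoPaths for &str {
    fn into_paths(self) -> Vec<String> {
        vec![self.to_string()]
    }
}

/// A fixed-size array of string slices.
impl<const N: usize> IntoPaths for [&str; N] {
    fn into_paths(self) -> Vec<String> {
        self.iter().map(|p| (*p).to_string()).collect()
    }
}

/// A borrowed slice of string slices.
impl IntoPaths for &[&str] {
    fn into_paths(self) -> Vec<String> {
        self.iter().map(|p| (*p).to_string()).collect()
    }
}

/// Stand-in for a reader method: it only normalizes the input paths.
pub fn csv_paths<P: IntoPaths>(paths: P) -> Vec<String> {
    paths.into_paths()
}
```

With this shape, `csv_paths("a.csv")` and `csv_paths(["a.csv", "b.csv"])` both compile, and the caller never has to wrap a single path in a collection.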

Commits on Sep 4, 2024

  1. feat: Implement JSON Options Configuration for DataFrameReader (sjrusso8#54)
    
    - Added JsonOptions struct to support JSON read options like `schema`, `multi_line`, `encoding`, and more.
    - Implemented ConfigOpts trait for JsonOptions to convert options into key-value pairs.
    - Updated DataFrameReader to include `json` method that accepts JsonOptions.
    - Documented all available JSON options, including example usage for setting options when reading JSON files. [TO DO]
    - Write tests to validate JSON options functionality.
    lexara-prime-ai committed Sep 4, 2024
    Commit b110961

Commits on Sep 5, 2024

  1. feat: Implement ORC Options Configuration for DataFrameReader (sjrusso8#54)
    
    - Example usage provided for setting ORC options when reading files.
    - Write tests to validate ORC options functionality.
    lexara-prime-ai committed Sep 5, 2024
    Commit 76ab23a
  2. feat: Implement Parquet Options Configuration for DataFrameReader (sjrusso8#54)
    
    - Added ParquetOptions struct to support Parquet read options like `mergeSchema`, `pathGlobFilter`, and `recursiveFileLookup`.
    - Implemented ConfigOpts trait for ParquetOptions to convert options into key-value pairs.
    - Updated DataFrameReader to include `parquet` method that accepts ParquetOptions.
    - Example usage provided for setting Parquet options when reading files.
    - Write tests to validate Parquet options functionality.
    lexara-prime-ai committed Sep 5, 2024
    Commit bb08370
  3. feat: Implement Text Options Configuration for DataFrameReader (sjrusso8#54)
    
    - Added TextOptions struct to support text read options like `wholetext`, `lineSep`, and `pathGlobFilter`.
    - Implemented ConfigOpts trait for TextOptions to convert options into key-value pairs.
    - Updated DataFrameReader to include `text` method that accepts TextOptions.
    - Example usage provided for setting text options when reading files.
    - Write tests to validate text options functionality.
    lexara-prime-ai committed Sep 5, 2024
    Commit e7f54aa
  4. feat: Implement Text and Parquet Options Configuration for DataFrameWriter (sjrusso8#54)
    
    - Added TextOptions struct to support text write options such as `whole_text` and `line_sep`.
    - Added ParquetOptions struct to support Parquet write options like `merge_schema`, `path_glob_filter`, and `datetime_rebase_mode`.
    - Implemented `write` method in DataFrameWriter to handle configuration for text and Parquet file formats.
    - Example usage provided for setting text and Parquet options when writing DataFrames.
    - Write tests to validate text and Parquet file writing functionality.
    lexara-prime-ai committed Sep 5, 2024
    Commit 43d0db4
  5. Added rustdocs to method implementations.

    lexara-prime-ai committed Sep 5, 2024
    Commit 836f0e4

Commits on Sep 6, 2024

  1. feat: Implement initial methods for file format reader and writer (sjrusso8#54)
    
    - Added support for reading and writing .csv, .json, .orc, .parquet, and .text file formats.
    - Created `ConfigOpts` trait for each file type to manage options in a structured way.
    - Added example method signatures for file reading using a configurable options object passed into methods.
    lexara-prime-ai committed Sep 6, 2024
    Commit 06af6ff

Commits on Sep 16, 2024

  1. Commit 0b468b7

Commits on Sep 22, 2024

  1. feat: Implement Configuration Options for DataFrameReader and Writer (sjrusso8#54)
    
    - Implemented additional fields in ParquetOptions compression.
    - Updated test_dataframe_read_parquet_with_options to ensure valid compression codec usage.
    - Enhanced test_dataframe_read_text_with_options to properly read lines by setting line_sep and disabling whole_text.
    - Derived `Debug` and `Clone` via #[derive(Debug, Clone)] for all options structs.
    - Changed the expected type of path_glob_filter to string.
    - Added the compression field to ParquetOptions, OrcOptions, and JsonOptions.
    - Updated documentation for all options structs to include descriptions for new and existing fields.
    lexara-prime-ai committed Sep 22, 2024
    Commit 8dff095
  2. feat: Refactor file format options with shared CommonFileOptions (sjrusso8#54)
    
    - Introduced CommonFileOptions to handle common configuration fields such as:
        - path_glob_filter
        - recursive_file_lookup
        - ignore_corrupt_files
        - ignore_missing_files
        - modified_before
        - modified_after
    - Updated CsvOptions, JsonOptions, OrcOptions, ParquetOptions, and TextOptions to use CommonFileOptions for the shared fields.
    - Updated the new() constructors for each file format options struct to initialize CommonFileOptions.
    - Refactored tests for each file format (e.g., ORC, CSV) to utilize the new CommonFileOptions, ensuring that both format-specific and shared options are properly tested.
    - Updated and verified tests for DataFrame reading and writing operations with updated options.
    lexara-prime-ai committed Sep 22, 2024
    Commit 99c63cc
  3. Updated rustdocs.

    lexara-prime-ai committed Sep 22, 2024
    Commit 8600b2d
  4. Commit d4bdefc
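
The `CommonFileOptions` refactor described in the Sep 22 commits can be pictured as composition: each format struct embeds the shared struct and merges its key-value pairs with the format-specific ones. The sketch below assumes the field name `common`, the helper `put`, the method name `to_options`, and the camelCase option keys (chosen to mirror Spark's generic file source option names). None of these are confirmed by the PR text.

```rust
use std::collections::HashMap;

/// Shared fields listed in the commit message; the types are assumptions.
#[derive(Debug, Clone, Default)]
pub struct CommonFileOptions {
    pub path_glob_filter: Option<String>,
    pub recursive_file_lookup: Option<bool>,
    pub ignore_corrupt_files: Option<bool>,
    pub ignore_missing_files: Option<bool>,
    pub modified_before: Option<String>,
    pub modified_after: Option<String>,
}

/// Hypothetical helper: insert a key only when a value was set.
fn put(map: &mut HashMap<String, String>, key: &str, value: Option<String>) {
    if let Some(v) = value {
        map.insert(key.to_string(), v);
    }
}

impl CommonFileOptions {
    pub fn to_options(&self) -> HashMap<String, String> {
        let mut m = HashMap::new();
        put(&mut m, "pathGlobFilter", self.path_glob_filter.clone());
        put(&mut m, "recursiveFileLookup", self.recursive_file_lookup.map(|b| b.to_string()));
        put(&mut m, "ignoreCorruptFiles", self.ignore_corrupt_files.map(|b| b.to_string()));
        put(&mut m, "ignoreMissingFiles", self.ignore_missing_files.map(|b| b.to_string()));
        put(&mut m, "modifiedBefore", self.modified_before.clone());
        put(&mut m, "modifiedAfter", self.modified_after.clone());
        m
    }
}

/// A format-specific struct embedding the shared options.
#[derive(Debug, Clone, Default)]
pub struct OrcOptions {
    pub merge_schema: Option<bool>,
    pub common: CommonFileOptions,
}

impl OrcOptions {
    /// The constructor initializes the shared options along with the rest.
    pub fn new() -> Self {
        Self::default()
    }

    /// Shared options first, then the format-specific ones on top.
    pub fn to_options(&self) -> HashMap<String, String> {
        let mut m = self.common.to_options();
        put(&mut m, "mergeSchema", self.merge_schema.map(|b| b.to_string()));
        m
    }
}
```

Keeping the shared fields in one struct means a new common option is added in a single place instead of once per file format.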

Commits on Oct 4, 2024

  1. Commit e82c20c

Commits on Oct 11, 2024

  1. Commit c6ba149