`surveyweathertool.src.survey.harmonizer`

Module Contents

Functions

`prepare_files_descriptions_data`(→ dict)	Prepares data descriptions for files from a JSON file.
`write_dict_to_file`(data, output_path, filename[, ...])	Writes a dictionary to a file in the specified format.
`add_extract_dict_to_excel`(→ tuple)	Adds a dictionary to an existing Excel workbook in the specified sheet.
`get_all_harmonized_files_dictionary`(→ Dict[str, str])	Retrieves a dictionary containing all harmonized files in a folder.
`preprocessing_transformer`(→ pandas.DataFrame)	Preprocesses the given dataframe by dropping duplicate rows, dropping NaN values, extracting year from 'wave' column, and converting selected columns to specific data types.
`prepare_concatenated_data`(...)	Prepares concatenated dataframes based on the given filenames and their associated indicators.
`indicator_merger`(→ pandas.DataFrame)	Merge and process a list of DataFrames containing indicator data.
`harmonize`()	Runs all setup steps on the harmonized dataset and saves the result for all selected indicators.

surveyweathertool.src.survey.harmonizer.prepare_files_descriptions_data(json_file_path: str) → dict

Prepares data descriptions for files from a JSON file.

Args:
    json_file_path (str): The path of the JSON file.
Returns:
    dict: The prepared data descriptions for files.

surveyweathertool.src.survey.harmonizer.write_dict_to_file(data, output_path, filename, file_type='xlsx')

Writes a dictionary to a file in the specified format.

Args:
    data: The dictionary to write to the file.
    output_path (str): The path of the output directory.
    filename (str): The name of the output file.
    file_type (str, optional): The file format to use ("csv", "xlsx", or "json"). Defaults to "xlsx".
Returns:
    None
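
As a rough illustration of the format dispatch described above, the following sketch writes a dictionary as JSON or CSV using only the standard library. The helper name `write_dict_sketch`, the two-column CSV layout, and the choice to append the extension to `filename` are all assumptions made for illustration, not the module's actual implementation. The "xlsx" branch is left out because it needs a third-party Excel writer.

```python
import csv
import json
import os


def write_dict_sketch(data, output_path, filename, file_type="json"):
    """Hypothetical stand-in for write_dict_to_file (json and csv only)."""
    os.makedirs(output_path, exist_ok=True)
    # Assumption: the extension is appended to the bare file name.
    target = os.path.join(output_path, f"{filename}.{file_type}")
    if file_type == "json":
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
    elif file_type == "csv":
        # Assumed layout: a header row, then one "key,value" row per entry.
        with open(target, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["key", "value"])
            for key, value in data.items():
                writer.writerow([key, value])
    else:
        raise ValueError(f"Unsupported file_type: {file_type!r}")
```

Raising on an unknown `file_type` is a design choice of this sketch; the documented function does not state how it handles unsupported formats.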

surveyweathertool.src.survey.harmonizer.add_extract_dict_to_excel(workbook_path: str, data: dict, excluding_columns: list, sheet_name: str = 'Indicator-Domain - Indicator', column1: str = 'Indicator_Domain', column2: str = 'Indicator') → tuple

Adds a dictionary to an existing Excel workbook in the specified sheet.

Args:
    workbook_path (str): The path of the Excel workbook.
    data (dict): The dictionary to add to the workbook.
    excluding_columns (list): The list of columns to exclude.
    sheet_name (str, optional): The name of the sheet to add the data to. Defaults to "Indicator-Domain - Indicator".
    column1 (str, optional): The column header for Indicator_Domain. Defaults to "Indicator_Domain".
    column2 (str, optional): The column header for Indicator. Defaults to "Indicator".
Returns:
    tuple: A tuple containing the indicator_domain_mapping, columns_filename_lookup, and file_columns_question_lookup.

surveyweathertool.src.survey.harmonizer.get_all_harmonized_files_dictionary(folder_path: str, file_extension_choices: List[str]) → Dict[str, str]

Retrieves a dictionary containing all harmonized files in a folder.

Args:
    folder_path (str): The path to the folder containing the files.
    file_extension_choices (List[str]): The list of file extensions to consider.
Returns:
    Dict[str, str]: A dictionary mapping file names to their full paths.
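
The name-to-path mapping described above can be sketched with `os.walk`. This is a minimal illustration under stated assumptions: the helper name `list_files_sketch` is hypothetical, the search is assumed to be recursive, extensions are assumed to be passed with a leading dot (for example ".csv"), and dictionary keys are assumed to keep the extension. The real function may differ on any of these points.

```python
import os
from typing import Dict, List


def list_files_sketch(folder_path: str, file_extension_choices: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in: map file names to full paths for matching extensions."""
    wanted = tuple(ext.lower() for ext in file_extension_choices)
    found: Dict[str, str] = {}
    # Assumption: subfolders are searched as well as the top-level folder.
    for root, _dirs, files in os.walk(folder_path):
        for name in sorted(files):
            if name.lower().endswith(wanted):
                found[name] = os.path.join(root, name)
    return found
```

Note that with this layout two files sharing a name in different subfolders would collide, and the one visited last would win.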

surveyweathertool.src.survey.harmonizer.preprocessing_transformer(df: pandas.DataFrame, primary_columns: List[str], columns_to_check: List[str] = None) → pandas.DataFrame

Preprocesses the given dataframe by dropping duplicate rows, dropping NaN values, extracting year from 'wave' column, and converting selected columns to specific data types.

Parameters:
    df (pd.DataFrame): The input dataframe.
    primary_columns (List[str]): A list of primary columns to be used for preprocessing.
    columns_to_check (List[str], optional): A list of columns to check for NaN values. Defaults to None.
Returns:
    pd.DataFrame: The preprocessed dataframe.
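
The four preprocessing steps listed above could look roughly like the following pandas sketch. Several details are assumptions, since the reference does not specify them: the helper name `preprocess_sketch` is hypothetical, 'wave' values are assumed to contain a four-digit year, the extracted year is stored in a new `year` column, and the primary columns are assumed to be identifiers that are cast to strings.

```python
from typing import List, Optional

import pandas as pd


def preprocess_sketch(
    df: pd.DataFrame,
    primary_columns: List[str],
    columns_to_check: Optional[List[str]] = None,
) -> pd.DataFrame:
    """Hypothetical stand-in for preprocessing_transformer."""
    out = df.drop_duplicates()
    # Drop rows with NaN in the checked columns (any column when None).
    out = out.dropna(subset=columns_to_check).copy()
    # Assumption: each 'wave' value contains a four-digit year, e.g. "wave 2012".
    year_text = out["wave"].astype(str).str.extract(r"(\d{4})", expand=False)
    out["year"] = pd.to_numeric(year_text, errors="coerce").astype("Int64")
    # Assumption: primary columns are identifiers, stored here as strings.
    out[primary_columns] = out[primary_columns].astype(str)
    return out.reset_index(drop=True)
```

Using the nullable `Int64` dtype keeps rows whose 'wave' value has no recognizable year instead of raising during the cast.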

surveyweathertool.src.survey.harmonizer.prepare_concatenated_data(filenames_indicator: Dict[str, set], files_paths: Dict[str, str], geodata_path: str) → List[pandas.DataFrame] | None

Prepares concatenated dataframes based on the given filenames and their associated indicators.

Parameters:
    filenames_indicator (Dict[str, set]): A dictionary where keys are filenames and values are sets of indicators.
    files_paths (Dict[str, str]): A dictionary where keys are filenames and values are file paths.
    geodata_path (str): The file path of the geolocation data.
Returns:
    List[pd.DataFrame] or None: A list of concatenated dataframes.

surveyweathertool.src.survey.harmonizer.indicator_merger(concatenated_data: List[pandas.DataFrame]) → pandas.DataFrame

Merge and process a list of DataFrames containing indicator data.

Args:
    concatenated_data (List[pd.DataFrame]): A list of DataFrames to be merged.
Returns:
    pd.DataFrame: The merged DataFrame after processing.
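
One common way to merge a list of DataFrames is to fold them together with `functools.reduce` and `DataFrame.merge`. The sketch below illustrates that general technique only: the helper name `merge_indicators_sketch`, the explicit `keys` parameter, and the outer join are assumptions, and the documented function (which takes only `concatenated_data`) may choose its keys and join type differently and apply further processing.

```python
from functools import reduce
from typing import List

import pandas as pd


def merge_indicators_sketch(frames: List[pd.DataFrame], keys: List[str]) -> pd.DataFrame:
    """Hypothetical stand-in: outer-join all frames on shared key columns."""
    if not frames:
        return pd.DataFrame(columns=keys)
    # Fold the list pairwise; an outer join keeps rows present in any frame.
    return reduce(lambda left, right: left.merge(right, on=keys, how="outer"), frames)
```

For example, merging one frame holding a `food` indicator with another holding an `assets` indicator on hypothetical keys `["hhid", "year"]` yields one row per household and year, with missing indicators left as NaN.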

surveyweathertool.src.survey.harmonizer.harmonize()

Runs all setup steps on the harmonized dataset and saves the result for all selected indicators.
