chart_me.charting_assembly_strategy.bivariate

Default implementation for bivariate variable charting

This module contains all the logic to go from metadata to actual charts. See the supporting documentation, which discusses the rules engine.

Typical usage example:

charts = assemble_bivariate_charts(df, [col1, col2], infered_data_types)

Classes

ChartMeDataTypeMetaType

Defines Normalized Metadata about Datatypes

InferedDataTypes

Core Object Returned for Assembler - column level specifications

Functions

pd_group_me(→ pandas.DataFrame)

A generic function to do group by aggregation in pandas

pd_truncate_date(→ pandas.Series)

Utility to truncate dates to the first of the month (YY-MM-01) and return them as strings

assemble_bivariate_charts(→ List[Union[altair.Chart, ...)

Delegated Function to Manage Bivariate Use Cases

build_scatter_plot(→ altair.Chart)

An implementation of scatter plot

build_hbar_value(→ altair.Chart)

An implementation of horizontal bar chart

build_facet_histogram(→ altair.Chart)

An implementation of histogram faceted by nominal variable

build_facet_hbars(→ altair.Chart)

An implementation of horizontal bar graph faceted by nominal variable

build_hconcat_temp_charts(→ altair.HConcatChart)

An implementation of horizontally concatenated charts for a temporal variable

build_heatmap(→ altair.Chart)

An implementation of heatmap

build_hconcat_temp_lc_charts(→ altair.HConcatChart)

An implementation that returns two charts to trend nominal and relative values

Module Contents

class chart_me.charting_assembly_strategy.bivariate.ChartMeDataTypeMetaType(*args, **kwds)[source]

Bases: enum.Enum

Defines Normalized Metadata about Datatypes

Defines Key, Boolean, Quantitative, Categorical High Cardinality, Categorical Low Cardinality, Temporal, Not Supported

KEY = ('K',)
BOOLEAN = ('B',)
QUANTITATIVE = ('Q',)
CATEGORICAL_HIGH_CARDINALITY = ('C-HC',)
CATEGORICAL_LOW_CARDINALITY = ('C-LC',)
TEMPORAL = ('T',)
NOT_SUPPORTED_TYPE = 'NA'
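
Most of the member values listed above carry a trailing comma, so they are one-element tuples rather than plain strings, while NOT_SUPPORTED_TYPE is a bare string. The sketch below re-declares the enum exactly as listed (a local stand-in, not an import from chart_me) to show how that affects `.value` access and lookup by value.

```python
from enum import Enum


class ChartMeDataTypeMetaType(Enum):
    """Local stand-in that copies the member values listed above."""

    KEY = ("K",)
    BOOLEAN = ("B",)
    QUANTITATIVE = ("Q",)
    CATEGORICAL_HIGH_CARDINALITY = ("C-HC",)
    CATEGORICAL_LOW_CARDINALITY = ("C-LC",)
    TEMPORAL = ("T",)
    NOT_SUPPORTED_TYPE = "NA"


quantitative = ChartMeDataTypeMetaType.QUANTITATIVE
# The value is a tuple, so the short code is its first element.
code = quantitative.value[0]
# Lookup by value needs the same tuple shape for the tuple-valued members...
looked_up = ChartMeDataTypeMetaType(("Q",))
# ...and a plain string for NOT_SUPPORTED_TYPE.
not_supported = ChartMeDataTypeMetaType("NA")
```

Code that compares `.value` against a plain string such as "Q" would therefore not match; comparing members directly (`meta is ChartMeDataTypeMetaType.QUANTITATIVE`) avoids the issue.
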
class chart_me.charting_assembly_strategy.bivariate.InferedDataTypes[source]

Core Object Returned for Assembler - column level specifications

preaggregated: bool
chart_me_data_types: Dict[str, ChartMeDataType]
chart_me_data_types_meta: Dict[str, ChartMeDataTypeMetaType]
chart_me.charting_assembly_strategy.bivariate.pd_group_me(df: pandas.DataFrame, cols: List[str] | str, agg_dict: Dict, is_temporal: bool = False, make_long_form=False) pandas.DataFrame[source]

A generic function to do group by aggregation in pandas

Helpful URL: https://jamesrledoux.com/code/group-by-aggregate-pandas

WARNING: The var_name of the long-form output is hard-coded to "measures".

Parameters:
  • df – data

  • cols – grouping columns

  • agg_dict – aggregation dictionary, e.g. {'Age': ['mean', 'min', 'max']}

  • is_temporal – boolean flag; when set, results are ordered by date rather than by count

  • make_long_form – if true, returns the result in long form (leverages reset_index and default column names)

Returns:

Returns tidy dataframe with default names

Return type:

pd.DataFrame
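
The grouping and reshaping described above can be sketched with plain pandas. This is a minimal stand-in, not the package's implementation: the helper name `group_and_melt`, the flattened column names, and the "value" column name are assumptions made for illustration. Only the "measures" variable name comes from the warning above.

```python
from typing import Dict, List, Union

import pandas as pd


def group_and_melt(
    df: pd.DataFrame, cols: Union[List[str], str], agg_dict: Dict
) -> pd.DataFrame:
    """Hypothetical stand-in: group, aggregate, and reshape to long form."""
    id_cols = [cols] if isinstance(cols, str) else list(cols)
    grouped = df.groupby(id_cols).agg(agg_dict)
    # A dict of lists yields two-level columns such as ("Age", "mean");
    # join the levels into flat names (an assumed naming scheme).
    grouped.columns = ["_".join(pair) for pair in grouped.columns]
    wide = grouped.reset_index()
    # One row per (group, aggregation); the variable column is "measures".
    return wide.melt(id_vars=id_cols, var_name="measures", value_name="value")


people = pd.DataFrame({"Sex": ["F", "M", "F", "M"], "Age": [30, 40, 50, 20]})
long_form = group_and_melt(people, "Sex", {"Age": ["mean", "min", "max"]})
```

Here `long_form` has the columns Sex, measures, and value, with one row for each combination of group and aggregation.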

chart_me.charting_assembly_strategy.bivariate.pd_truncate_date(df: pandas.DataFrame, col: str) pandas.Series[source]

Utility to truncate dates to the first of the month (YY-MM-01) and return them as strings

Helpful URLs:

https://predictivehacks.com/?all-tips=how-to-truncate-dates-to-month-in-pandas

https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.to_period.html

Parameters:
  • df – dataframe

  • col – column name of date to truncate

Returns:

A Series of strings

Return type:

pd.Series
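
A month truncation of this kind can be sketched with the `Series.dt.to_period` accessor referenced above. The helper name `truncate_to_month` is hypothetical, and the four-digit-year output format is an assumption; the package's own string format may differ.

```python
import pandas as pd


def truncate_to_month(df: pd.DataFrame, col: str) -> pd.Series:
    """Hypothetical stand-in: map each date to the first day of its month."""
    months = df[col].dt.to_period("M")      # monthly periods, e.g. 2023-03
    first_days = months.dt.to_timestamp()   # timestamp at the start of each month
    return first_days.dt.strftime("%Y-%m-%d")  # formatted as strings


events = pd.DataFrame(
    {"when": pd.to_datetime(["2023-03-15", "2023-03-31", "2024-11-02"])}
)
truncated = truncate_to_month(events, "when")
```

Both March dates collapse to the same string, which is what makes the result usable as a grouping key for monthly aggregation.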

chart_me.charting_assembly_strategy.bivariate.assemble_bivariate_charts(df: pandas.DataFrame, cols: List[str], infered_data_types: chart_me.datatype_infer_strategy.InferedDataTypes, **kwargs) List[altair.Chart | altair.HConcatChart][source]

Delegated Function to Manage Bivariate Use Cases

Parameters:
  • df – dataframe

  • cols – a list of two columns

  • infered_data_types – An instance of InferedDataTypes object

Returns:

List of Altair charts or compound charts

Raises:

ValueError – if cols does not contain exactly two column names
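
The delegation pattern can be illustrated with a small lookup keyed on the pair of metadata codes. This is only a sketch: `pick_builder` and `RULES` are hypothetical names, and the pairings in the table are guesses for illustration, not the package's actual rules. The length check mirrors the ValueError documented above.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table: (meta type code, meta type code) -> builder name.
# The pairings are illustrative guesses, not chart_me's actual rules.
RULES: Dict[Tuple[str, str], str] = {
    ("Q", "Q"): "build_scatter_plot",
    ("C-LC", "Q"): "build_facet_histogram",
    ("C-LC", "C-LC"): "build_heatmap",
}


def pick_builder(cols: List[str], meta: Dict[str, str]) -> str:
    """Validate the column pair, then look up a builder name."""
    if len(cols) != 2:
        raise ValueError(f"expected exactly 2 columns, got {len(cols)}")
    key = (meta[cols[0]], meta[cols[1]])
    # Try both orderings so ("Q", "C-LC") matches the ("C-LC", "Q") rule.
    return RULES.get(key) or RULES.get(key[::-1], "not supported")


meta = {"age": "Q", "fare": "Q", "deck": "C-LC"}
scatter = pick_builder(["age", "fare"], meta)
faceted = pick_builder(["age", "deck"], meta)
```

Calling `pick_builder(["age"], meta)` raises ValueError, matching the documented behavior for a column list whose length is not 2.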

chart_me.charting_assembly_strategy.bivariate.build_scatter_plot(df: pandas.DataFrame, col_name1: str, col_name2: str) altair.Chart[source]

An implementation of scatter plot

chart_me.charting_assembly_strategy.bivariate.build_hbar_value(df: pandas.DataFrame, col_name_x: str, col_name_y: str) altair.Chart[source]

An implementation of horizontal bar chart

chart_me.charting_assembly_strategy.bivariate.build_facet_histogram(df: pandas.DataFrame, col_name_facet: str, col_name_hist_q: str) altair.Chart[source]

An implementation of histogram faceted by nominal variable

chart_me.charting_assembly_strategy.bivariate.build_facet_hbars(df: pandas.DataFrame, col_name_facet: str, col_name_y: str, col_name_x: str) altair.Chart[source]

An implementation of horizontal bar graph faceted by nominal variable

chart_me.charting_assembly_strategy.bivariate.build_hconcat_temp_charts(df: pandas.DataFrame, col_name_y_m, col_name_q, col_name_measure: str = 'measures') altair.HConcatChart[source]

An implementation of horizontally concatenated charts for a temporal variable

WARNING: Function assumes processing through pd_group_me to separate count from other aggregations

chart_me.charting_assembly_strategy.bivariate.build_heatmap(df: pandas.DataFrame, col_name_x: str, col_name_y: str, col_name_q: str) altair.Chart[source]

An implementation of heatmap

chart_me.charting_assembly_strategy.bivariate.build_hconcat_temp_lc_charts(df: pandas.DataFrame, col_name_t_y_m: str, col_name_n: str, col_name_q: str) altair.HConcatChart[source]

An implementation that returns two charts to trend nominal and relative values

chart 1 -> bar trend chart

chart 2 -> stacked bar chart