Francis Burnet – AI Engineering Portfolio

Capstone portfolio spanning AI engineering, applied data science, machine learning, and deep learning.

Francis Burnet headshot

Capstone 6 Evidence Map

Capstone 6 evidence image
Capstone Summary

This documentation details Capstone 6 of the 2026 Microsoft AI Engineering Program, a comprehensive Adult Census income classification project. The project follows a systematic data science workflow, beginning with the identification and resolution of missing values and the generation of demographic visualizations. To prepare the data for accurate modeling, the curriculum requires encoding, scaling, and oversampling to address the inherent class imbalance in the dataset. A significant portion of the work evaluates performance across six machine learning models, including Random Forest and Logistic Regression. All findings, including performance metrics and data artifacts, are recorded in structured formats (CSV and JSON) for audit purposes. Ultimately, this page serves as a technical portfolio entry demonstrating applied expertise in machine learning and predictive analytics.

Capstone 6 Scope

Capstone 6 converts the copied Adult Census classification requirements into an executed notebook with cleaning, imbalance repair, exploratory plots, and six-model comparison outputs.

Primary staged dataset: adultcensusincome.csv.

The notebook exports model metrics and a structured summary JSON for the site workflow.
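As a sketch of that export, the round-trip below writes and reloads a summary JSON in the shape the notebook produces (the file name matches the notebook's `session_6_summary.json`; only a subset of its keys is shown here, with values taken from the recorded run):

```python
import json
from pathlib import Path

# Minimal subset of the notebook's summary export.
summary = {
    "dataset_shape": [32561, 15],
    "balance_summary": {"class_counts": {"<=50K": 24720, ">50K": 7841}},
    "best_model": {"model": "Random Forest Classifier"},
}

# Write the structured summary, then reload it as the site workflow would.
out = Path("session_6_summary.json")
out.write_text(json.dumps(summary, indent=2), encoding="utf-8")
reloaded = json.loads(out.read_text(encoding="utf-8"))
print(reloaded["best_model"]["model"])  # Random Forest Classifier
```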

Original Project PDF

The copied project directions are embedded here for direct comparison against the notebook and output artifacts.

Requirement Checklist

All items below map to the copied requirements file.

  • 1a: Build a classification model for predicting income using the Adult Census Income dataset.
  • 1b: Load the dataset `adultcensusincome.csv`.
  • 1c: Check for null values in any columns.
  • 1d: Check for `?` values in any columns.
  • 1e: Handle the null values and `?` values.
  • 1f: Check the distribution of the target variable `income`.
  • 1g: Identify whether the dataset is balanced.
  • 1h: Create a barplot for column `income`.
  • 1i: Create a distribution plot for column `age`.
  • 1j: Create a barplot for column `education`.
  • 1k: Create a barplot for years of education using column `education.num`.
  • 1l: Create a pie chart for marital status using column `marital.status`.
  • 1m: Create countplots of income across the age, education, marital status, and sex columns.
  • 1n: Draw a heatmap of data correlation and identify the columns most highly correlated with income.
  • 1o: Prepare the dataset for modeling.
  • 1p: Label-encode all categorical columns.
  • 1q: Prepare independent variables `X` and dependent variable `Y` (`income`).
  • 1r: Perform feature scaling using `StandardScaler`.
  • 1s: Fix the imbalance in the dataset using one technique such as `SMOTE` or `RandomOverSampler`.
  • 1t: Perform a train/test split in the ratio 80:20 with `random_state=42`.
  • 1u: Train a Logistic Regression model.
  • 1v: Train a KNN Classifier model.
  • 1w: Train an SVM Classifier model.
  • 1x: Train a Naive Bayes Classifier model.
  • 1y: Train a Decision Tree Classifier model.
  • 1z: Train a Random Forest Classifier model.
  • 1aa: Evaluate each model on accuracy and F1 score.
  • 1ab: Identify the best model.

Requirement Walkthrough

Each walkthrough block maps the copied PDF requirements to the executed notebook cells, exported outputs, and reviewable evidence staged with this capstone.

6a

Audit Nulls And Question-Mark Values

Notebook section: Load, audit, and cleaning cells

Requirement: Load the dataset, detect nulls and `?` markers, and clean the categorical columns before modeling.

The notebook explicitly records the question-mark counts in workclass, occupation, and native.country, then fills the missing values with each column's mode before downstream analysis.

Results Capture
  • Question-mark counts are exported in the summary JSON.
  • The cleaning step leaves zero missing values for the model pipeline.
  • The cleaned dataframe is used for all plots and model training steps.
df = df.replace(' ?', np.nan).replace('?', np.nan)
for column in object_columns:
    if df[column].isna().any():
        df[column] = df[column].fillna(df[column].mode().iloc[0])
Associated Artifact
  • Correlation Heatmap: saved encoded-feature correlation heatmap for the income target review.
6b

Produce The Required Income And Demographic Charts

Notebook section: Distribution and count-plot cells

Requirement: Create the income, age, education, marital-status, and grouped count plots required by the copied PDF.

The notebook exports the full demographic plot bundle so the site can show the income balance view, age distribution, and the grouped categorical comparisons directly.

Results Capture
  • The class balance summary is recorded as {"<=50K":24720,">50K":7841}.
  • The plot bundle includes income, age, education, marital status, and grouped income-by-category visuals.
income_counts = df['income'].value_counts()
sns.barplot(x=income_counts.index, y=income_counts.values)
sns.histplot(df['age'], kde=True)
Associated Artifacts
  • Income Barplot: saved income class count chart.
  • Age Distribution: saved age distribution plot.
  • Education Barplot: saved bar chart for education category counts.
  • Education Level Barplot: saved bar chart for the numeric education-level distribution.
  • Marital Status Distribution: saved pie chart for the marital status breakdown.
  • Income by Education: saved grouped chart for income distribution by education level.
  • Income by Marital Status: saved grouped chart for income by marital status.
  • Income by Sex: saved grouped chart for income by gender.
  • Income by Age Band: saved grouped chart for income by age band.
  • Model Comparison: saved comparison chart for model accuracy and F1 score.
6c

Encode, Balance, Scale, And Compare Classifiers

Notebook section: Encoding, resampling, and model-training cells

Requirement: Label-encode categorical columns, apply StandardScaler, fix class imbalance, and compare Logistic Regression, KNN, SVM, Naive Bayes, Decision Tree, and Random Forest.

The notebook label-encodes the cleaned dataset, uses RandomOverSampler as the explicit imbalance fix, and exports a six-model comparison table scored by accuracy and F1.

Results Capture
  • Balanced training distribution is {"1":19775,"0":19775}.
  • Best model by F1/accuracy ranking: Random Forest Classifier.
  • The comparison table is exported as session_6_model_metrics.csv.
sampler = RandomOverSampler(random_state=42)
X_train_balanced, y_train_balanced = sampler.fit_resample(X_train_scaled, y_train)
model.fit(X_train_balanced, y_train_balanced)

Colab Notebook

This section provides the notebook preview, launch link, and project file links.

The notebook opens in Google Colab when a launch URL is configured, and the project files and outputs remain available here on the site.
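When configured, the launch URL follows the standard Colab open-from-GitHub pattern. A minimal sketch, assuming the notebook file `capstone_session_6.ipynb` sits in the capstone folder defined in the notebook's configuration cell:

```python
from urllib.parse import quote

# Repository coordinates from the notebook's configuration cell.
GITHUB_REPO_OWNER = 'FrancisBurnet'
GITHUB_REPO_NAME = 'francisburnet'
GITHUB_REPO_BRANCH = 'main'
# Assumed notebook location: the capstone folder plus the exported notebook file.
notebook_path = (
    'Incremental Capstones/Machine Learning Using Python/'
    'Capstone Session 6/capstone_session_6.ipynb'
)

# Colab's open-from-GitHub URL pattern:
# https://colab.research.google.com/github/<owner>/<repo>/blob/<branch>/<path>
launch_url = (
    'https://colab.research.google.com/github/'
    f'{GITHUB_REPO_OWNER}/{GITHUB_REPO_NAME}/blob/{GITHUB_REPO_BRANCH}/'
    f'{quote(notebook_path, safe="/")}'
)
print(launch_url)
```

Spaces in the folder names are percent-encoded by `quote`, so the resulting URL is safe to use directly as the launch link.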

Capstone 6 Notebook Workspace
Launch Colab
Embedded Notebook Preview
Cell 1 Markdown

Capstone Session 6

This notebook is generated from the copied Capstone_Session_6.pdf directions and the staged adultcensusincome.csv dataset.

Cell 2 Markdown

Objective

Build and compare classification models for Adult Census income prediction while preserving the PDF-ordered exploratory, preprocessing, imbalance, and evaluation flow.

Cell 3 Code · python
from pathlib import Path
import json
import sys
from urllib.parse import quote

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from IPython.display import display
from imblearn.over_sampling import RandomOverSampler
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

IS_COLAB = 'google.colab' in sys.modules
GITHUB_REPO_OWNER = 'FrancisBurnet'
GITHUB_REPO_NAME = 'francisburnet'
GITHUB_REPO_BRANCH = 'main'
CAPSTONE_ROOT = Path('Incremental Capstones/Machine Learning Using Python/Capstone Session 6')
DATASET_FILENAME = 'adultcensusincome.csv'


def build_raw_github_url(relative_path: Path) -> str:
    encoded_path = quote(relative_path.as_posix(), safe='/')
    return (
        f"https://raw.githubusercontent.com/{GITHUB_REPO_OWNER}/{GITHUB_REPO_NAME}/"
        f"{GITHUB_REPO_BRANCH}/{encoded_path}"
    )


def resolve_capstone_dir() -> Path | None:
    current = Path.cwd().resolve()
    capstone_parts = CAPSTONE_ROOT.parts
    for candidate in [current, *current.parents]:
        if len(candidate.parts) >= len(capstone_parts) and candidate.parts[-len(capstone_parts):] == capstone_parts:
            return candidate
        nested_candidate = candidate / CAPSTONE_ROOT
        if nested_candidate.exists():
            return nested_candidate
    return None


CAPSTONE_DIR = resolve_capstone_dir()
DATASET_URL = build_raw_github_url(CAPSTONE_ROOT / DATASET_FILENAME)

if CAPSTONE_DIR is not None:
    OUTPUT_ROOT = CAPSTONE_DIR
    OUTPUT_MODE = 'permanent capstone outputs'
    OUTPUT_DISPLAY = (CAPSTONE_ROOT / 'outputs').as_posix()
else:
    runtime_root = Path('/content/capstone-session-6-runtime') if IS_COLAB else Path.cwd().resolve() / 'capstone-session-6-runtime'
    OUTPUT_ROOT = runtime_root
    OUTPUT_MODE = 'runtime scratch outputs; export final artifacts back into the capstone outputs folder'
    OUTPUT_DISPLAY = 'capstone-session-6-runtime/outputs'

OUTPUTS_DIR = (OUTPUT_ROOT / 'outputs').resolve()
PLOTS_DIR = OUTPUTS_DIR / 'plots'
OUTPUTS_DIR.mkdir(parents=True, exist_ok=True)
PLOTS_DIR.mkdir(parents=True, exist_ok=True)
sns.set_theme(style='whitegrid')
pd.set_option('display.max_columns', 100)

print('Runtime:', 'Google Colab' if IS_COLAB else 'Notebook runtime')
print('Capstone artifact path:', CAPSTONE_ROOT.as_posix())
print('Dataset source:', DATASET_URL)
print('Output mode:', OUTPUT_MODE)
print('Output target:', OUTPUT_DISPLAY)
Cell 4 Code · python
df = pd.read_csv(DATASET_URL)
display(df.head())
print('Dataset source used:', DATASET_URL)
print('Shape:', df.shape)
object_columns = df.select_dtypes(include=['object', 'string']).columns.tolist()
question_mark_counts = {column: int((df[column].astype(str).str.strip() == '?').sum()) for column in object_columns}
print('Question mark counts:', question_mark_counts)
Output
   age workclass  fnlwgt     education  education.num marital.status  \
0   90         ?   77053       HS-grad              9        Widowed   
1   82   Private  132870       HS-grad              9        Widowed   
2   66         ?  186061  Some-college             10        Widowed   
3   54   Private  140359       7th-8th              4       Divorced   
4   41   Private  264663  Some-college             10      Separated   

          occupation   relationship     sex  capital.gain  capital.loss  \
0                  ?  Not-in-family  Female             0          4356   
1    Exec-managerial  Not-in-family  Female             0          4356   
2                  ?      Unmarried  Female             0          4356   
3  Machine-op-inspct      Unmarried  Female             0          3900   
4     Prof-specialty      Own-child  Female             0          3900   

   hours.per.week native.country income  
0              40  United-States  <=50K  
1              18  United-States  <=50K  
2              40  United-States  <=50K  
3              40  United-States  <=50K  
4              40  United-States  <=50K  
Shape: (32561, 14)
Question mark counts: {'workclass': 1836, 'education': 0, 'marital.status': 0, 'occupation': 1843, 'relationship': 0, 'sex': 0, 'native.country': 583, 'income': 0}
Cell 5 Code · python
df = df.replace(' ?', np.nan).replace('?', np.nan)
for column in object_columns:
    if df[column].isna().any():
        df[column] = df[column].fillna(df[column].mode().iloc[0])
for column in df.columns:
    if df[column].isna().any():
        df[column] = df[column].fillna(df[column].median())
print('Remaining null values:', int(df.isna().sum().sum()))
Output
Remaining null values: 0
Cell 6 Code · python
fig, ax = plt.subplots(figsize=(8, 5))
income_counts = df['income'].value_counts()
sns.barplot(x=income_counts.index, y=income_counts.values, ax=ax)
ax.set_title('Income Distribution')
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'income_barplot.png', dpi=150)
plt.show()
plt.close(fig)
balance_summary = {
    'class_counts': income_counts.to_dict(),
    'minority_ratio': round(float(income_counts.min() / income_counts.max()), 4),
}
balance_summary
Output
<Figure size 800x500 with 1 Axes>
{'class_counts': {'<=50K': 24720, '>50K': 7841}, 'minority_ratio': 0.3172}
Cell 7 Code · python
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(df['age'], kde=True, ax=ax)
ax.set_title('Age Distribution')
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'age_distribution.png', dpi=150)
plt.show()
plt.close(fig)

for column in ['education', 'education.num']:
    fig, ax = plt.subplots(figsize=(12, 5))
    counts = df[column].value_counts().sort_values(ascending=False).head(15) if column == 'education' else df[column].value_counts().sort_index()
    sns.barplot(x=counts.index.astype(str), y=counts.values, ax=ax)
    ax.set_title(f'{column} Barplot')
    ax.tick_params(axis='x', rotation=35)
    fig.tight_layout()
    fig.savefig(PLOTS_DIR / f"{column.replace('.', '_')}_barplot.png", dpi=150)
    plt.show()
    plt.close(fig)

marital_counts = df['marital.status'].value_counts()
fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(marital_counts.values, labels=marital_counts.index, autopct='%1.1f%%', startangle=90)
ax.set_title('Marital Status Distribution')
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'marital_status_pie.png', dpi=150)
plt.show()
plt.close(fig)
Output
<Figure size 1000x500 with 1 Axes>
<Figure size 1200x500 with 1 Axes>
<Figure size 1200x500 with 1 Axes>
<Figure size 800x800 with 1 Axes>
Cell 8 Code · python
df['age_band'] = pd.cut(df['age'], bins=[16, 25, 35, 45, 55, 65, 100], include_lowest=True)
for column in ['age_band', 'education', 'marital.status', 'sex']:
    fig, ax = plt.subplots(figsize=(12, 5))
    grouped = pd.crosstab(df[column], df['income'])
    grouped.plot(kind='bar', stacked=False, ax=ax)
    ax.set_title(f'Income Count by {column}')
    ax.tick_params(axis='x', rotation=35)
    fig.tight_layout()
    fig.savefig(PLOTS_DIR / f"income_by_{str(column).replace('.', '_')}.png", dpi=150)
    plt.show()
    plt.close(fig)
Output
<Figure size 1200x500 with 1 Axes>
<Figure size 1200x500 with 1 Axes>
<Figure size 1200x500 with 1 Axes>
<Figure size 1200x500 with 1 Axes>
Cell 9 Code · python
encoded_df = df.copy()
label_encoders = {}
for column in encoded_df.select_dtypes(include=['object', 'string', 'category']).columns:
    encoder = LabelEncoder()
    encoded_df[column] = encoder.fit_transform(encoded_df[column].astype(str))
    label_encoders[column] = encoder

fig, ax = plt.subplots(figsize=(12, 8))
sns.heatmap(encoded_df.corr(numeric_only=True), cmap='viridis', ax=ax)
ax.set_title('Encoded Feature Correlation Heatmap')
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'correlation_heatmap.png', dpi=150)
plt.show()
plt.close(fig)
correlation_to_income = encoded_df.corr(numeric_only=True)['income'].sort_values(ascending=False).to_dict()
correlation_to_income
Output
<Figure size 1200x800 with 2 Axes>
{'income': 1.0,
 'education.num': 0.335153952690943,
 'age_band': 0.23638023840533365,
 'age': 0.234037102648859,
 'hours.per.week': 0.22968906567080932,
 'capital.gain': 0.22332881819538292,
 'sex': 0.21598015058403752,
 'capital.loss': 0.15052631177035683,
 'education': 0.07931660927729825,
 'occupation': 0.03462453745149705,
 'native.country': 0.02305804502812131,
 'workclass': 0.0026929737847155824,
 'fnlwgt': -0.009462557247529214,
 'marital.status': -0.19930700917197833,
 'relationship': -0.25091814171775123}
Cell 10 Code · python
X = encoded_df.drop(columns=['income'])
y = encoded_df['income']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
sampler = RandomOverSampler(random_state=42)
X_train_balanced, y_train_balanced = sampler.fit_resample(X_train_scaled, y_train)
print('Balanced train distribution:', pd.Series(y_train_balanced).value_counts().to_dict())
Output
Balanced train distribution: {1: 19775, 0: 19775}
Cell 11 Code · python
models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'KNN Classifier': KNeighborsClassifier(),
    'SVM Classifier': SVC(),
    'Naive Bayes Classifier': GaussianNB(),
    'Decision Tree Classifier': DecisionTreeClassifier(random_state=42),
    'Random Forest Classifier': RandomForestClassifier(random_state=42, n_estimators=200),
}
results = []
for name, model in models.items():
    model.fit(X_train_balanced, y_train_balanced)
    predictions = model.predict(X_test_scaled)
    results.append({
        'model': name,
        'accuracy': float(accuracy_score(y_test, predictions)),
        'f1_score': float(f1_score(y_test, predictions)),
    })
results_df = pd.DataFrame(results).sort_values(['f1_score', 'accuracy'], ascending=False).reset_index(drop=True)
display(results_df)
best_model = results_df.iloc[0].to_dict()
best_model
Output
                      model  accuracy  f1_score
0  Random Forest Classifier  0.849532  0.688889
1            SVM Classifier  0.793951  0.665002
2            KNN Classifier  0.777675  0.627955
3       Logistic Regression  0.775986  0.621334
4  Decision Tree Classifier  0.809458  0.600064
5    Naive Bayes Classifier  0.817442  0.567794
{'model': 'Random Forest Classifier',
 'accuracy': 0.849531705819131,
 'f1_score': 0.6888888888888889}
Cell 12 Code · python
fig, ax = plt.subplots(figsize=(10, 5))
results_df.plot(x='model', y=['accuracy', 'f1_score'], kind='bar', ax=ax)
ax.set_title('Session 6 Model Comparison')
ax.set_ylim(0, 1.05)
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'model_comparison.png', dpi=150)
plt.show()
plt.close(fig)
results_df.to_csv(OUTPUTS_DIR / 'session_6_model_metrics.csv', index=False)
summary = {
    'dataset_shape': list(df.shape),
    'question_mark_counts_before_cleaning': question_mark_counts,
    'balance_summary': balance_summary,
    'balanced_train_distribution': pd.Series(y_train_balanced).value_counts().to_dict(),
    'correlation_to_income': correlation_to_income,
    'model_results': results,
    'best_model': best_model,
    'notes': [
        'RandomOverSampler is used as the explicit imbalance-fix technique from the PDF options.',
        'The split is stratified to preserve the original class ratio before balancing the training data.',
    ],
}
with open(OUTPUTS_DIR / 'session_6_summary.json', 'w', encoding='utf-8') as handle:
    json.dump(summary, handle, indent=2)
summary
Output
<Figure size 1000x500 with 1 Axes>
{'dataset_shape': [32561, 15],
 'question_mark_counts_before_cleaning': {'workclass': 1836,
  'education': 0,
  'marital.status': 0,
  'occupation': 1843,
  'relationship': 0,
  'sex': 0,
  'native.country': 583,
  'income': 0},
 'balance_summary': {'class_counts': {'<=50K': 24720, '>50K': 7841},
  'minority_ratio': 0.3172},
 'balanced_train_distribution': {1: 19775, 0: 19775},
 'correlation_to_income': {'income': 1.0,
  'education.num': 0.335153952690943,
  'age_band': 0.23638023840533365,
  'age': 0.234037102648859,
  'hours.per.week': 0.22968906567080932,
  'capital.gain': 0.22332881819538292,
  'sex': 0.21598015058403752,
  'capital.loss': 0.15052631177035683,
  'education': 0.07931660927729825,
  'occupation': 0.03462453745149705,
  'native.country': 0.02305804502812131,
  'workclass': 0.0026929737847155824,
  'fnlwgt': -0.009462557247529214,
  'marital.status': -0.19930700917197833,
  'relationship': -0.25091814171775123},
 'model_results': [{'model': 'Logistic Regression',
   'accuracy': 0.7759864885613389,
   'f1_score': 0.6213340254347262},
  {'model': 'KNN Classifier',
   'accuracy': 0.7776754183939812,
   'f1_score': 0.6279547790339157},
  {'model': 'SVM Classifier',
   'accuracy': 0.7939505604176262,
   'f1_score': 0.6650024962556166},
  {'model': 'Naive Bayes Classifier',
   'accuracy': 0.8174420389989252,
   'f1_score': 0.5677935296255907},
  {'model': 'Decision Tree Classifier',
   'accuracy': 0.8094580070627975,
   'f1_score': 0.6000644537544312},
  {'model': 'Random Forest Classifier',
   'accuracy': 0.849531705819131,
   'f1_score': 0.6888888888888889}],
 'best_model': {'model': 'Random Forest Classifier',
  'accuracy': 0.849531705819131,
  'f1_score': 0.6888888888888889},
 'notes': ['RandomOverSampler is used as the explicit imbalance-fix technique from the PDF options.',
  'The split is stratified to preserve the original class ratio before balancing the training data.']}
Project Notes
  • Question-mark and null-value audit.
  • Demographic and target-distribution plot bundle.
  • Encoded feature correlation review.
  • Imbalance repair and six-model comparison outputs.
Launch Controls

Notebook Launch

Open the matching notebook in Google Colab or review the tracked notebook source in GitHub.

Project File Links
  • Notebook File: Open Notebook File
    Executed Session 6 notebook for the copied Adult Census workflow.
  • Source Dataset: Open Source Dataset
    Original Adult Census dataset staged with the copied capstone files.
  • Model Metrics CSV: Open Model Metrics CSV
    Accuracy and F1 export for the six evaluated classifiers.
  • Summary JSON: Open Summary JSON
    Structured summary of cleaning counts, class balance, and best model.

Outputs And Results

Key Outputs
  • Executed notebook artifact saved as capstone_session_6.ipynb.
  • The model comparison export ranks six classifiers by accuracy and F1 score.
  • The plot bundle covers target distribution, age, education, marital status, grouped income counts, and encoded correlation.
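The ranking in the metrics export can be reproduced directly from the recorded scores. A minimal sketch using the values from the notebook run and the same ranking rule (F1 first, accuracy as tiebreaker):

```python
import pandas as pd

# Accuracy/F1 values as recorded in session_6_model_metrics.csv.
results = pd.DataFrame([
    {'model': 'Logistic Regression',      'accuracy': 0.775986, 'f1_score': 0.621334},
    {'model': 'KNN Classifier',           'accuracy': 0.777675, 'f1_score': 0.627955},
    {'model': 'SVM Classifier',           'accuracy': 0.793951, 'f1_score': 0.665002},
    {'model': 'Naive Bayes Classifier',   'accuracy': 0.817442, 'f1_score': 0.567794},
    {'model': 'Decision Tree Classifier', 'accuracy': 0.809458, 'f1_score': 0.600064},
    {'model': 'Random Forest Classifier', 'accuracy': 0.849532, 'f1_score': 0.688889},
])

# Rank by F1 score, then accuracy, as in the notebook's comparison cell.
ranked = results.sort_values(['f1_score', 'accuracy'], ascending=False).reset_index(drop=True)
print(ranked.iloc[0]['model'])  # Random Forest Classifier
```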
Key Findings
  • The class imbalance remains visible in the raw data with counts {"<=50K":24720,">50K":7841}.
  • The current best model is Random Forest Classifier.
  • The page now exposes both the preprocessing evidence and the final classification comparison artifacts.