Data cleaning is a critical step in the data preparation process. Whether it is analytics, business intelligence, or data engineering, clean data ensures more precise and reliable insights. 1️⃣ Convert Text to Lower/Upper Case Ensure consistency in categorical fields like names or categories. -- Convert to lowercase SELECT LOWER(column_name) AS cleaned_column FROM table_name; -- Convert to uppercase SELECT UPPER(column_name) AS cleaned_column FROM table_name; 2️⃣ Remove Extra Spaces from Text Fields Trim leading/trailing spaces using TRIM() SELECT TRIM(column_name) AS cleaned_column FROM table_name; 3️⃣ Convert Date Strings to a Consistent Format Transform text-based dates into a usable date format: SELECT STR_TO_DATE(column_name, '%m/%d/%Y') AS formatted_date FROM table_name; 4️⃣Identify & Manage Outliers Filter numeric values within a defined range: SELECT * FROM table_name WHERE column_name BETWEEN lower_limit AND upper_limit; 5️⃣ Remove Special Characters Strip out unwanted symbols using regular expressions: SELECT REGEXP_REPLACE(column_name, '[^a-zA-Z0-9 ]', '') AS cleaned_column FROM table_name; 6️⃣ Standardize Categorical Values Unify inconsistent text representations: UPDATE table_name SET column_name = 'Male' WHERE column_name IN ('M', 'male'); 7️⃣ Replace NULLs with Default Values SELECT COALESCE(column_name, 'DefaultValue') AS column_name FROM table_name; 8️⃣ Delete Duplicate Rows WITH CTE AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS row_num FROM table_name ) DELETE FROM table_name WHERE id IN ( SELECT id FROM CTE WHERE row_num > 1 );

Data cleaning is a critical step in the data preparation process. Whether it is analytics, business intelligence, or data engineering, clean data ensures more precise and reliable insights.
1️⃣ Convert Text to Lower/Upper Case
Ensure consistency in categorical fields like names or categories.
-- Convert to lowercase
SELECT LOWER(column_name) AS cleaned_column FROM table_name;
-- Convert to uppercase
SELECT UPPER(column_name) AS cleaned_column FROM table_name;
2️⃣ Remove Extra Spaces from Text Fields Trim leading/trailing spaces using TRIM()
SELECT TRIM(column_name) AS cleaned_column FROM table_name;
3️⃣ Convert Date Strings to a Consistent Format
Transform text-based dates into a usable date format:
SELECT STR_TO_DATE(column_name, '%m/%d/%Y') AS formatted_date FROM table_name;
4️⃣Identify & Manage Outliers
Filter numeric values within a defined range:
SELECT * FROM table_name
WHERE column_name BETWEEN lower_limit AND upper_limit;
5️⃣ Remove Special Characters
Strip out unwanted symbols using regular expressions:
SELECT REGEXP_REPLACE(column_name, '[^a-zA-Z0-9 ]', '') AS cleaned_column
FROM table_name;
6️⃣ Standardize Categorical Values
Unify inconsistent text representations:
UPDATE table_name
SET column_name = 'Male'
WHERE column_name IN ('M', 'male');
7️⃣ Replace NULLs with Default Values
SELECT COALESCE(column_name, 'DefaultValue') AS column_name
FROM table_name;
8️⃣ Delete Duplicate Rows
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS row_num
FROM table_name
)
DELETE FROM table_name
WHERE id IN (
SELECT id FROM CTE WHERE row_num > 1
);