PostgreSQL Function for Cleaning and Updating Data

In any database, maintaining clean and consistent data is crucial for accurate analysis and reporting. In this article, we will create a PostgreSQL function that cleans and updates customer data: it trims whitespace, converts email addresses to lowercase, and strips everything except digits from phone numbers.

Step 1: Create the Customer Table and Insert Data

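First, we need to create the customer table and insert some sample data. The table has columns for customer_id, first_name, last_name, email, phone, and address, plus a created_at timestamp that defaults to the insertion time. Note that the first sample email has mixed case and a trailing space, and every phone number contains hyphens, so there is something to clean.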

CREATE TABLE customer (
    customer_id UUID PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    email TEXT UNIQUE,
    phone TEXT,
    address TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO customer (customer_id, first_name, last_name, email, phone, address) VALUES
('b1a4d84a-9f6f-4d67-8a93-6cabc213d6de', 'John', 'Doe', 'John.Doe@example.com ', '123-456-7890', '123 Maple St'),
('d28d8f15-5e6c-4c71-a8b2-9c6a13233f19', 'Jane', 'Smith', 'jane.smith@example.com', '987-654-3210', '456 Oak St'),
('e9d8b7d9-8176-4c5b-bcb5-a35e7a1e6d32', 'Alice', 'Johnson', 'alice.johnson@example.com', '555-123-4567', '789 Pine St');

Step 2: Create the Cleaning Function

Next, we will create a PostgreSQL function named clean_and_update_customer_data. This function will take a customer_id as input and perform the following operations on the corresponding record:

  • Trim whitespace from the email and phone fields.
  • Convert the email field to lowercase.
  • Remove all non-numeric characters from the phone field.
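
CREATE OR REPLACE FUNCTION clean_and_update_customer_data(p_customer_id UUID)
RETURNS VOID AS $$
BEGIN
    UPDATE customer
    SET
        -- Strip leading and trailing spaces, then lowercase the address.
        email = LOWER(TRIM(email)),
        -- Remove every character that is not a digit 0-9.
        -- The 'g' flag replaces all matches, not just the first one.
        phone = REGEXP_REPLACE(TRIM(phone), '[^0-9]', '', 'g')
    WHERE
        -- Only the row for the given customer is touched.
        customer_id = p_customer_id;
END;
$$ LANGUAGE plpgsql;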
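
To see what the two expressions in the function body do before running them against real rows, they can be applied to literal values. This is only an illustrative sketch; the column aliases cleaned_email and cleaned_phone are names chosen for this example.

```sql
-- Same expressions as in the function body, applied to sample literals.
SELECT
    LOWER(TRIM('John.Doe@example.com '))                     AS cleaned_email,  -- john.doe@example.com
    REGEXP_REPLACE(TRIM('123-456-7890'), '[^0-9]', '', 'g')  AS cleaned_phone;  -- 1234567890
```

Keep in mind that TRIM without further arguments removes only space characters from both ends, so tabs or newlines around an email address would survive this step.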

Step 3: Execute the Function

To clean and update the data for a specific customer, call the function with the appropriate customer_id. For example, to clean the data for the customer with customer_id 'b1a4d84a-9f6f-4d67-8a93-6cabc213d6de', use the following SQL statement:

SELECT clean_and_update_customer_data('b1a4d84a-9f6f-4d67-8a93-6cabc213d6de');

Step 4: Verify the Updates

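You can verify the updates by querying the customer table. For the sample row, the email should now be john.doe@example.com (lowercased, trailing space removed) and the phone 1234567890: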

SELECT customer_id, email, phone FROM customer WHERE customer_id = 'b1a4d84a-9f6f-4d67-8a93-6cabc213d6de';
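
Checking one row at a time does not scale, so a complementary check is to list every row that the function would still change. The query below is a minimal sketch that reuses the same cleaning rules as the function; it assumes nothing beyond the customer table defined in Step 1.

```sql
-- List customers whose email or phone is not yet in cleaned form.
SELECT customer_id, email, phone
FROM customer
WHERE email IS DISTINCT FROM LOWER(TRIM(email))  -- mixed case or surrounding spaces
   OR phone ~ '[^0-9]';                          -- any character other than a digit
```

With the sample data, and after cleaning only the first customer in Step 3, this should return the two remaining customers, whose phone numbers still contain hyphens.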

By following these steps, you can create a PostgreSQL function that keeps customer contact data in a consistent format: emails trimmed and lowercased, phone numbers reduced to digits. Consistent values make lookups, deduplication, and reporting more reliable.

With this approach, you can easily update multiple customer records by calling the function with different customer_id values, ensuring consistency and cleanliness across your dataset.
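
One way to do this for the whole table, sketched here under the assumption that the table is small enough for row-by-row processing, is to call the function once per row from a SELECT:

```sql
-- Call the cleaning function for every customer in the table.
-- Each call runs its own single-row UPDATE.
SELECT clean_and_update_customer_data(customer_id)
FROM customer;
```

Because email is declared UNIQUE, this can fail with a unique-constraint violation if two rows hold addresses that differ only in letter case or surrounding spaces. For large tables, a single set-based UPDATE using the same expressions would likely be faster than one function call per row.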

