SQL - Delete Duplicate Records
To delete duplicate rows from a table, you can use the DELETE statement in combination with subqueries or temporary tables. Here's an example of how you can delete duplicate rows using a subquery approach:
Let's assume we have a "Customers" table with the following columns:
CustomerID (unique identifier for each customer)
FirstName (first name of the customer)
LastName (last name of the customer)
Email (email address of the customer)
To delete duplicate rows based on the combination of FirstName and LastName, you can use the following query:
DELETE FROM Customers
WHERE CustomerID NOT IN (
SELECT MIN(CustomerID)
FROM Customers
GROUP BY FirstName, LastName
);
Explanation:
The subquery SELECT MIN(CustomerID) FROM Customers GROUP BY FirstName, LastName retrieves the minimum CustomerID for each combination of FirstName and LastName in the "Customers" table. This means it selects one unique row for each set of duplicate rows.
The outer DELETE statement uses the NOT IN operator to delete rows where the CustomerID is not in the result of the subquery. In other words, it keeps only the rows with the minimum CustomerID for each set of duplicates and deletes the rest.
By executing this query, all duplicate rows based on the combination of FirstName and LastName will be deleted, keeping only one row for each unique combination.
Please note that before performing any deletion operation, it's crucial to backup your data or run the query on a test environment to ensure the desired outcome and avoid accidental data loss. Adjust the table name and column names as per your specific database schema and requirements.