Delete Duplicate Emails - SQL Practical Question

Delete Duplicate Emails is an easy and tricky question asked in many interviews. We are going to solve this question with very simple self-joins.

Let's start with having a table named Person containing emails,

SQL Schema for Person table:

CREATE TABLE Person (Id int, Email varchar(255))
INSERT INTO Person (id, email) values ('1', 'john@example.com')
INSERT INTO Person (id, email) values ('2', 'bob@example.com')
INSERT INTO Person (id, email) values ('3', 'john@example.com')

Id	Email
1	john@example.com
2	bob@example.com
3	john@example.com

Write an SQL query to delete all the duplicate emails, keeping only one unique email with the smallest id. Note that you are supposed to write a DELETE statement and not a SELECT one.

The query result format is following:

We need to have output:

Id	Email
1	john@example.com
2	bob@example.com

So the question is clear and the output is clear. Now we will start making a query to get the output:

First, let's join tables with self-join on the basis of Email:

SELECT p1.* FROM PERSON p1 JOIN Person p2 WHERE p1.Email=p2.Email;

The output will look like this,

Id	Email
3	john@example.com
1	john@example.com
2	bob@example.com
3	john@example.com
1	john@example.com

Now, we will need to find the bigger id having the same email address as other records. We will be going to another condition so the query will look like this:

SELECT p1.* FROM PERSON p1 JOIN Person p2 WHERE p1.Email=p2.Email AND p1.id>p2.id;

So, the output will be: