Combining “like” rows
Combining “like” rows
I have a table of locations_from, locations_to, and count of flights. I want to combine the rows where departure on one row equals arrival on the other (for example LA TO NY combined with NY TO LA) and SUM the like rows.
I think it would be better explained with an example.
BEFORE
locations_from locations_to # of Flights
--------------------------------------------------
San Francisco, CA Los Angeles, CA 29558
Los Angeles, CA San Francisco, CA 32389
New York, NY Los Angeles, CA 30389
Los Angeles, CA New York, NY 35484
Las Vegas, NV Los Angeles, CA 28363
Los Angeles, CA Las Vegas, NV 34455
Honolulu, HI Kahului, HI 46563
Kahului, HI Honolulu, HI 16879
San Francisco, CA New York, NY 44654
New York, NY San Francisco, CA 25882
AFTER
From/To From/To # of Flights
---------------------------------------------------
San Francisco, CA Los Angeles, CA 61947
New York, NY Los Angeles, CA 65873
Las Vegas, NV Los Angeles, CA 62818
Honolulu, HI Kahului, HI 63442
San Francisco, CA New York, NY 70536
I have tried a cross join on itself
where a.locations_from = b.locations_to
this works but I end up getting double the rows (IE one for LA TO NY and also one for NY TO LA)
yes, that's correct
– pluck
Jul 1 at 7:05
Can you add a table schema. I am basically looking for a primary key in your table. So basically besides the from, to and no of flights column, do you have a primary key column?
– Ankur Patel
Jul 1 at 7:11
3 Answers
3
Simply Use CASE ... END
to "sort" the airports and then group by that "sorted" pair.
CASE ... END
SELECT CASE
WHEN [locations_from] > [locations_to]
THEN [locations_to]
ELSE
[locations_from]
END [From/To],
CASE
WHEN [locations_from] > [locations_to]
THEN [locations_from]
ELSE
[locations_to]
END [From/To],
sum([# of Flights]) [# of Flights]
FROM elbat
GROUP BY CASE
WHEN [locations_from] > [locations_to]
THEN [locations_to]
ELSE
[locations_from]
END,
CASE
WHEN [locations_from] > [locations_to]
THEN [locations_from]
ELSE
[locations_to]
END;
SQL Fiddle
And here is also an alternative solution using a FULL JOIN
on the opposite direction.
FULL JOIN
SELECT coalesce(t1.[locations_from], t2.locations_from) [locations_from],
coalesce(t1.[locations_to], t2.locations_from) [locations_to],
coalesce(t1.[# of Flights], 0) + coalesce(t2.[# of Flights], 0) [# of Flights]
FROM elbat t1
FULL JOIN elbat t2
ON t2.[locations_from] = t1.[locations_to]
AND t2.[locations_to] = t1.[locations_from]
WHERE (t1.[locations_from] IS NULL
AND t1.[locations_to] IS NULL
OR t1.[locations_from] < t1.[locations_to])
AND (t2.[locations_from] IS NULL
AND t2.[locations_to] IS NULL
OR t2.[locations_from] > t2.[locations_to]);
SQL Fiddle
@HABO: Works for me. SQL Fiddle It gives me 84 = 42 + 42 flights in total. I don't see any fun.
– sticky bit
Jul 1 at 17:09
Apologies, my bad. Apparently I can't read all the way across an output row.
– HABO
Jul 1 at 20:08
Using union
Declare @YourTable Table (SomeRowID int,[locations_from] varchar(50),[locations_to] varchar(50),[# of Flights] int)
Insert Into @YourTable Values
(1,'San Francisco, CA','Los Angeles, CA',29558)
,(2,'Los Angeles, CA','San Francisco, CA',32389)
,(3,'New York, NY','Los Angeles, CA',30389)
,(4,'Los Angeles, CA','New York, NY',35484)
,(5,'Las Vegas, NV','Los Angeles, CA',28363)
,(6,'Los Angeles, CA','Las Vegas, NV',34455)
,(7,'Honolulu, HI','Kahului, HI',46563)
,(8,'Kahului, HI','Honolulu, HI',16879)
,(9,'San Francisco, CA','New York, NY',44654)
,(10,'New York, NY','San Francisco, CA',25882);
select [locations_from], [locations_to], sum([# of Flights])
from
(
select [locations_from], [locations_to], [# of Flights]
from @YourTable
where [locations_from] < [locations_to]
union all
select [locations_to], [locations_from], [# of Flights]
from @YourTable
where [locations_from] > [locations_to]
) t
group by [locations_from], [locations_to]
Yup. Even better. +1 However, one should <= or >= just in case of an emergency turn-around NY to NY :)
– John Cappelletti
13 hours ago
Just another option using a CROSS APPLY
and a conditional aggregation
CROSS APPLY
This examples assumes you have some sort of RowID
Example
Declare @YourTable Table (SomeRowID int,[locations_from] varchar(50),[locations_to] varchar(50),[# of Flights] int)
Insert Into @YourTable Values
(1,'San Francisco, CA','Los Angeles, CA',29558)
,(2,'Los Angeles, CA','San Francisco, CA',32389)
,(3,'New York, NY','Los Angeles, CA',30389)
,(4,'Los Angeles, CA','New York, NY',35484)
,(5,'Las Vegas, NV','Los Angeles, CA',28363)
,(6,'Los Angeles, CA','Las Vegas, NV',34455)
,(7,'Honolulu, HI','Kahului, HI',46563)
,(8,'Kahului, HI','Honolulu, HI',16879)
,(9,'San Francisco, CA','New York, NY',44654)
,(10,'New York, NY','San Francisco, CA',25882)
;with cte as (
Select A.SomeRowID
,Loc1 = min(Loc)
,Loc2 = max(Loc)
,Flights = sum(Val)
From @YourTable A
Cross Apply ( values ([locations_from],[# of Flights])
,([locations_to] ,0)
) B (Loc,Val)
Group By A.SomeRowID
)
Select Loc1
,Loc2
,Flights=sum(Flights)
From cte
Group By Loc1,Loc2
Returns
Loc1 Loc2 Flights
Honolulu, HI Kahului, HI 63442
Las Vegas, NV Los Angeles, CA 62818
Los Angeles, CA New York, NY 65873
Los Angeles, CA San Francisco, CA 61947
New York, NY San Francisco, CA 70536
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Just to clarify, you just need the number of flights between 2 cities. The from/to city doesn't matter, correct?
– Ankur Patel
Jul 1 at 7:01