graphing dorm movements in neo4j


graphing dorm movements in neo4j



I'm really struggling getting my head around neo4j and was hoping someone might be able to help point me in the right direction with the below.



Basically, I have a list of what can be referred to as events; the event can be said to describe a patient entering and leaving a room.



Each event has a unique identifier; it also has an identifier for the student in question along with start and end times (e.g. the student entered the room at 12:00 and left at 12:05) and an identifier for the room.



The event and data might look along the lines of the below, columns separated by a pipe delimiter


ID|SID|ROOM|ENTERS|LEAVES
1|1|BLUE|1/01/2015 11:00|4/01/2015 10:19
2|2|GREEN|1/01/2015 12:11|1/01/2015 12:11
3|2|YELLOW|1/01/2015 12:11|1/01/2015 12:20
4|2|BLUE|1/01/2015 12:20|5/01/2015 10:48
5|3|GREEN|1/01/2015 18:41|1/01/2015 18:41
6|3|YELLOW|1/01/2015 18:41|1/01/2015 21:00
7|3|BLUE|1/01/2015 21:00|9/01/2015 9:30
8|4|BLUE|1/01/2015 19:30|3/01/2015 11:00
9|5|GREEN|2/01/2015 19:08|2/01/2015 19:08
10|5|ORANGE|2/01/2015 19:08|3/01/2015 2:43
11|5|PURPLE|3/01/2015 2:43|4/01/2015 16:44
12|6|GREEN|3/01/2015 11:52|3/01/2015 11:52
13|6|YELLOW|3/01/2015 11:52|3/01/2015 17:45
14|6|RED|3/01/2015 17:45|7/01/2015 10:00



Questions that might be asked could be:



what rooms have student x visited and in what order



what does the movement of students between rooms look like - to which room does students go to when they leave room y



That sounds simple enough but I'm tying myself into knots.



I started off creating unique constraints for both student and room



create constraint on (student: Student) assert student.id is unique



I then did the same for room.



I then loaded student as



using periodic commit 1000 load csv with headers from 'file://c:/event.csv' as line merge (s:Student {id: line.SID});



I also did the same for room and visits.



I have absolutely no idea how to create the relationships though to be able to answer the above questions though. Each event lists the time the student enters and leaves the room but not the room the student went to. Starting with the extract, should the extract be changed so that it contains the room the student left for? If someone could help talk through how I need to think of the relationships that needs to be created, that would be very much appreciated.



Cheers




1 Answer
1



As the popular saying goes, there is more than one way to skin an Ouphe - or thwart a mage. One way you could do it (which makes for the simplest modeling imo) is as follows :


CREATE CONSTRAINT ON (s:Student) ASSERT s.studentID IS UNIQUE;
CREATE CONSTRAINT ON (r:Room) ASSERT r.roomID IS UNIQUE;

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///dorm.csv" as line fieldterminator '|'
MERGE (s:Student {studentID: line.SID})
MERGE (r:Room {roomID: line.ROOM})
CREATE (s)-[:VISIT {starttime: apoc.date.parse(line.ENTERS,'s',"dd/MM/yyyy HH:mm"), endtime: apoc.date.parse(line.LEAVES,'s',"dd/MM/yyyy HH:mm")}]->(r);

# What rooms has student x visited and in what order
MATCH (s:Student {studentID: "2"})-[v:VISIT]->(r:Room)
RETURN r.roomID,v.starttime ORDER BY v.starttime;

# What does the movement of students between rooms look like - to which room does students go to when they leave room y
MATCH (s:Student)-[v:VISIT]->(r:Room {roomID: "GREEN"})
WITH s, v
MATCH (s)-[v2:VISIT]->(r2:Room)
WHERE v2.starttime > v.endtime
RETURN s.studentID, r2.roomID, v2.starttime ORDER BY s.studentID, v2.starttime;



So actually you would only have Student and Room as nodes and a student VISITing a room would make up the relationship (with the enter/leave times as properties of that relationship). Now, as you can see that might not be ideal for your second query (although it does work). One alternative is to have a Visit node and chain it (as timeline events) to both Students and Rooms. There's plenty of examples around on how to do that.



Hope this helps,
Tom





Excellent, thanks heaps Tom! I don't have apoc but added a field for date/time as a number to be able to sort it. Also came across a series of blog posts describing the analysis of web site traffic and the order pages are visited, which looks like the second approach that you mention. Anyway, thanks for all your help, much appreciated!
– c95mbq
May 22 '17 at 0:03






You're welcome. Two (small) remarks. Adding APOC will give you a lot more capabilities. It is strictly true that it is not part of the Neo4j release, however it is maintained (and kept in step with the releases) by the best-of-the-best of the Neo4j team. Secondly, when doing the queries I noticed that your times are not granular enough. It seemed possible that somebody entered a room, left a room and entered the next room all at the same "moment" (which of course is impossible). That might mess up the more complex queries.
– Tom Geudens
May 22 '17 at 4:43





Thanks Tom. I'll definitely look at apoc but am that green I can barely start neo at the moment :-) or is that :-/
– c95mbq
May 22 '17 at 10:22





Can you please explain why this doesn't work USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///dorm.csv" as line fieldterminator '|' MERGE (s:Student {studentID: line.SID}) MERGE (r:Room {roomID: line.ROOM}) MERGE (v:Visit {visitID: line.ID, studentID: line.SID, roomID: line.ROOM}) USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///dorm.csv" as line fieldterminator '|' match (s:Student {studentID:line.SID}), (r:Room {roomID: line.roomID}), (v:Visit {visitID:line.ID}) create (s)-[:VERB]->(v)-[:OBJECT]->(r)
– c95mbq
May 22 '17 at 10:29





As far as APOC goes, download the latest jar for your Neo4j version, drop it in the plugins-directory, restart neo4j and your all set :-).
– Tom Geudens
May 22 '17 at 10:44






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

List of Kim Possible characters

Audio Livestreaming with Python & Flask

NSwag: Generate C# Client from multiple Versions of an API