Utilize a vertex-centric index

Turn on cache to improve latency

Avoid Vertex Traversal over Edge

    • If a query has to jump from V1 to V2 via E1 (2 hops) only to read properties of V2, it is a very expensive operation.
    • So we should copy the most frequently accessed properties of V1 and V2 onto E1.
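
The idea above can be sketched in plain Java, without any Titan API. The edge keeps its own copy of the hot properties of both endpoints, so a read that only needs those properties never has to load V2. The classes `SketchVertex` and `SketchEdge`, the `name` property and the `copyHotProperties` helper are all hypothetical stand-ins made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a vertex (not the Titan API).
class SketchVertex {
    final Map<String, Object> properties = new HashMap<>();
    int loads = 0; // counts how often a property had to be read from the vertex

    Object load(String key) {
        loads++; // in a real store this would be the extra hop to another row
        return properties.get(key);
    }
}

// Hypothetical in-memory stand-in for an edge (not the Titan API).
class SketchEdge {
    final SketchVertex out;
    final SketchVertex in;
    final Map<String, Object> properties = new HashMap<>();

    SketchEdge(SketchVertex out, SketchVertex in) {
        this.out = out;
        this.in = in;
    }

    // Copy the frequently read keys of both endpoints onto the edge,
    // prefixed so that the two endpoints cannot collide.
    void copyHotProperties(List<String> hotKeys) {
        for (String key : hotKeys) {
            if (out.properties.containsKey(key)) {
                properties.put("out." + key, out.properties.get(key));
            }
            if (in.properties.containsKey(key)) {
                properties.put("in." + key, in.properties.get(key));
            }
        }
    }
}

class DenormalizedEdgeDemo {
    public static void main(String[] args) {
        SketchVertex v1 = new SketchVertex();
        SketchVertex v2 = new SketchVertex();
        v1.properties.put("name", "alice");
        v2.properties.put("name", "bob");

        SketchEdge e1 = new SketchEdge(v1, v2);
        e1.copyHotProperties(Arrays.asList("name"));

        // Answered from the edge alone; v2.load(...) is never called.
        System.out.println(e1.properties.get("in.name"));
    }
}
```

The trade-off is the usual one for denormalized data: whenever a copied vertex property changes, the copies on the incident edges have to be updated as well.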


  • Handle failures in the application
    When committing a transaction, Titan will attempt to persist all changes to the storage backend. This might not always succeed, due to IO exceptions, network errors, machine crashes or resource unavailability. The application has to roll back the failed transaction itself, because only the user knows the transactional boundary.
  • Check whether the existence of vertices needs to be verified
    • TransactionBuilder.checkExternalVertexExistence(boolean): determines whether this transaction should verify the existence of vertices for user-provided vertex ids. Such checks require access to the database, which takes time. The existence check should only be disabled if the user is absolutely sure that the vertex exists; otherwise data corruption can ensue.
    • TransactionBuilder.checkInternalVertexExistence(boolean): determines whether this transaction should double-check the existence of vertices during query execution. This can be useful to avoid phantom vertices on eventually consistent storage backends. Disabled by default. Enabling this setting can slow down query processing.
  • Convert date-time values to long (for example, epoch milliseconds)
  • Enable batch loading
    • TransactionBuilder.enableBatchLoading(): enables batch loading for an individual transaction.
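
For the "handle failures in the application" point, a minimal sketch of a rollback-and-retry loop is shown here. It deliberately avoids the Titan API: `SketchTx` is a hypothetical two-method interface standing in for a real transaction, and `CommitWithRetry.run` and its `maxAttempts` parameter are names invented for this example. It assumes the unit of work is safe to repeat from scratch.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical minimal transaction interface; a stand-in, not the Titan API.
interface SketchTx {
    void commit();   // may throw a RuntimeException on a backend failure
    void rollback();
}

class CommitWithRetry {
    // Runs `work` in a fresh transaction and commits it. On failure the
    // transaction is rolled back and the whole unit of work is retried,
    // up to maxAttempts times. Returns the number of attempts used.
    static int run(Supplier<SketchTx> newTx, Consumer<SketchTx> work, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            SketchTx tx = newTx.get();
            try {
                work.accept(tx);
                tx.commit();
                return attempt;
            } catch (RuntimeException e) {
                tx.rollback(); // discard the failed transaction before retrying
                last = e;
            }
        }
        throw new IllegalStateException("commit failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] commits = {0};
        // Simulated backend that fails the first two commits.
        Supplier<SketchTx> flaky = () -> new SketchTx() {
            public void commit() {
                if (++commits[0] < 3) {
                    throw new RuntimeException("simulated backend failure");
                }
            }
            public void rollback() { }
        };
        System.out.println("attempts: " + run(flaky, tx -> { }, 5));
    }
}
```

The loop retries the whole unit of work rather than only the commit, since the application is the only party that knows where the transaction begins and ends. A real implementation would probably also wait between attempts and retry only on errors known to be temporary.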

Data Partition Strategy 

Large scale data analysis

    • Note that Titan stores 2 wide rows: (1) an adjacency list of incident_edges + target_vertices, and (2) an adjacency list of vertices_edges.
    • So for fast real-time computation (aggregation of data over vertices), implement in-memory map-reduce code (preferably using SparkComputer) that executes per vertex in parallel.

    • Note: a VertexProgram is a piece of code that is executed at each vertex in a logically parallel manner until some termination condition is met (e.g. a certain number of iterations have occurred, no more data is changing in the graph, etc.).
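
To make the vertex-program idea concrete, here is a small plain-Java sketch of the same execution model. It does not use the TinkerPop VertexProgram interface or SparkComputer. The class `VertexProgramSketch` and its `minLabel` method are hypothetical names, and the chosen computation (spreading the smallest vertex id through each connected component) is only an example. Each pass reads the labels of the previous pass, which is what makes the per-vertex steps logically parallel, and the loop stops on either of the two termination conditions mentioned above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VertexProgramSketch {
    // Propagates the smallest vertex id through each connected component.
    // adjacency maps a vertex id to its neighbours (list both directions
    // for an undirected graph).
    static Map<Integer, Integer> minLabel(Map<Integer, List<Integer>> adjacency, int maxIterations) {
        Map<Integer, Integer> labels = new HashMap<>();
        for (Integer v : adjacency.keySet()) {
            labels.put(v, v); // every vertex starts with its own id as label
        }

        for (int i = 0; i < maxIterations; i++) {
            // Each vertex reads only the previous pass's labels, so the
            // per-vertex steps are independent and could run in parallel.
            Map<Integer, Integer> next = new HashMap<>(labels);
            boolean changed = false;
            for (Map.Entry<Integer, List<Integer>> entry : adjacency.entrySet()) {
                int current = labels.get(entry.getKey());
                int best = current;
                for (Integer neighbour : entry.getValue()) {
                    Integer neighbourLabel = labels.get(neighbour);
                    if (neighbourLabel != null && neighbourLabel < best) {
                        best = neighbourLabel;
                    }
                }
                if (best != current) {
                    next.put(entry.getKey(), best);
                    changed = true;
                }
            }
            labels = next;
            if (!changed) {
                break; // termination: no more data is changing
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        graph.put(1, Arrays.asList(2));
        graph.put(2, Arrays.asList(1, 3));
        graph.put(3, Arrays.asList(2));
        graph.put(4, Arrays.asList(5));
        graph.put(5, Arrays.asList(4));
        System.out.println(minLabel(graph, 10));
    }
}
```

In this run, vertices 1, 2 and 3 end with label 1, and vertices 4 and 5 end with label 4. The iteration cap bounds the work even if the graph is too large for labels to settle.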

Query against Patterns

Text Search (full / predicate)
