[Solved] HUGE performance degradation

    • [Solved] HUGE performance degradation

      I built an ETL project as a test, loading 5 years of sales data into a cube, and was very impressed with the performance. As the log below shows, the entire cube load finished in under 2 minutes.

      Source Code

      2009-11-27 09:53:59,396 INFO [job LoadCube] (ExecutionState.java:125) - Starting execution of job LoadCube
      2009-11-27 09:53:59,400 INFO [job LoadCube] (ExecutionState.java:130) - Parameter failOnError: true
      2009-11-27 09:53:59,400 INFO [job LoadCube] (ExecutionState.java:130) - Parameter logLimit: 100
      2009-11-27 09:53:59,422 INFO [job LoadCube] (CubeLoad.java:454) - Starting load LoadCube of Cube Sales
      2009-11-27 09:53:59,424 INFO [job LoadCube] (TableTransform.java:147) - Data retrieval from transform SalesCube
      2009-11-27 09:53:59,432 INFO [job LoadCube] (TableSource.java:208) - Data retrieval from extract sales_cube
      2009-11-27 09:54:48,191 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 10000
      2009-11-27 09:54:48,616 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 20000
      2009-11-27 09:54:49,011 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 30000
      2009-11-27 09:54:49,405 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 40000
      2009-11-27 09:54:49,795 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 50000
      2009-11-27 09:54:50,187 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 60000
      2009-11-27 09:54:50,545 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 70000
      2009-11-27 09:54:50,939 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 80000
      2009-11-27 09:54:51,338 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 90000
      2009-11-27 09:54:51,687 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 100000
      2009-11-27 09:54:52,044 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 110000
      .
      (cut)
      .
      2009-11-27 09:55:31,141 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1200000
      2009-11-27 09:55:31,520 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1210000
      2009-11-27 09:55:31,876 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1220000
      2009-11-27 09:55:32,240 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1230000
      2009-11-27 09:55:32,613 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1240000
      2009-11-27 09:55:32,971 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1250000
      2009-11-27 09:55:33,331 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1260000
      2009-11-27 09:55:33,690 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1270000
      2009-11-27 09:55:34,022 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1280000
      2009-11-27 09:55:34,123 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1282380
      2009-11-27 09:55:34,125 INFO [job LoadCube] (CubeLoad.java:459) - Finished load LoadCube of Cube Sales. Filled cells changed from 106115 to 407331.
      2009-11-27 09:55:34,127 INFO [job LoadCube] (ExecutionState.java:148) - Finished execution of job LoadCube with status: Completed successfully


      I then tried to simplify my ETL job, using the same data source but this time doing some FieldTransform lookups. To my surprise, performance degraded HUGELY, to the point that I had to kill the job. See below.

      Source Code

      2009-11-27 08:36:38,652 INFO [job BuildCube] (ExecutionState.java:130) - Parameter logLimit: 100
      2009-11-27 08:36:38,652 INFO [job BuildCube] (ExecutionState.java:130) - Parameter failOnError: true
      2009-11-27 08:36:38,652 INFO [job BuildCube] (ExecutionState.java:125) - Starting execution of job BuildCube
      2009-11-27 08:36:38,677 INFO [job BuildCube] (CubeLoad.java:454) - Starting load LSalesCube of Cube Sales
      2009-11-27 08:36:38,678 INFO [job BuildCube] (TableTransform.java:147) - Data retrieval from transform TSales
      2009-11-27 08:36:38,678 INFO [job BuildCube] (TableTransform.java:147) - Data retrieval from transform TSalesCube
      2009-11-27 08:36:38,679 INFO [job BuildCube] (TableSource.java:208) - Data retrieval from extract ExtractSales
      2009-11-27 08:36:48,445 INFO [job BuildCube] (TableSource.java:208) - Data retrieval from extract ExtractTerritories
      2009-11-27 08:36:48,557 INFO [job BuildCube] (TableSource.java:208) - Data retrieval from extract ExtractCustName
      2009-11-27 08:38:36,331 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 10000
      2009-11-27 08:40:44,752 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 20000
      2009-11-27 08:45:30,286 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 30000
      2009-11-27 08:46:41,720 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 40000
      2009-11-27 08:47:01,853 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 50000
      2009-11-27 08:48:31,315 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 60000
      2009-11-27 08:52:20,493 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 70000
      2009-11-27 08:55:24,627 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 80000
      2009-11-27 08:55:39,632 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 90000
      2009-11-27 08:57:00,166 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 100000


      (Killed at this point, after 20 minutes!)
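
To put a number on the slowdown, the "Data vectors loaded" lines from both logs can be turned into an average load rate. The following is only a rough sketch: the helper names `parse_progress` and `vectors_per_second` are made up for illustration and are not part of the ETL tool, and the sample strings are simply the first and last progress lines of each run copied from the logs above.

```python
import re
from datetime import datetime

# Matches progress lines such as:
# 2009-11-27 09:54:48,191 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 10000
PROGRESS = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*Data vectors loaded: (\d+)"
)

def parse_progress(log_text):
    """Return (timestamp, vectors_loaded) pairs for every progress line."""
    points = []
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            points.append((stamp, int(match.group(2))))
    return points

def vectors_per_second(points):
    """Average load rate between the first and last progress line."""
    (t0, n0), (t1, n1) = points[0], points[-1]
    return (n1 - n0) / (t1 - t0).total_seconds()

# First and last progress lines of each run, copied from the logs.
FAST_RUN = """\
2009-11-27 09:54:48,191 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 10000
2009-11-27 09:55:34,123 INFO [job LoadCube] (CubeLoad.java:118) - Data vectors loaded: 1282380
"""
SLOW_RUN = """\
2009-11-27 08:38:36,331 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 10000
2009-11-27 08:57:00,166 INFO [job BuildCube] (CubeLoad.java:118) - Data vectors loaded: 100000
"""

fast_rate = vectors_per_second(parse_progress(FAST_RUN))
slow_rate = vectors_per_second(parse_progress(SLOW_RUN))
print(f"fast run: {fast_rate:.0f} vectors/s")
print(f"slow run: {slow_rate:.0f} vectors/s")
print(f"slowdown: about {fast_rate / slow_rate:.0f}x")
```

By this measure the first job loads roughly 27,700 vectors per second and the second roughly 80, so the second run is several hundred times slower per vector.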

      So I thought it was the lookups and removed the field transformations, but it's still just as slow.

      I will continue to track it down, but is there a way to get more verbose debug output?

      -Bob

      The post was edited 1 time, last by blabj.

    • RE: [Solved] HUGE performance degradation

      It was a very simple mistake in the normalization TableTransform: the join wasn't at the lowest level of one dimension, and it didn't complain! It just proceeded to make a massively horrendous join.
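
For anyone wondering why joining above the lowest level hurts so much, here is a minimal sketch of the effect in plain Python. It does not reproduce how the TableTransform works internally; the `inner_join` helper and the tiny `sales` and `customers` tables are invented purely for illustration. The assumption is that the lookup key is unique at the lowest level (customer) but repeats at a higher level (region).

```python
from collections import defaultdict

def inner_join(left, right, key):
    """Naive inner join: every left row is paired with every matching right row."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

# Hypothetical data: "customer" is the lowest level, "region" is its parent.
sales = [
    {"customer": "C1", "region": "North", "amount": 10},
    {"customer": "C2", "region": "North", "amount": 20},
    {"customer": "C3", "region": "South", "amount": 5},
]
customers = [
    {"customer": "C1", "region": "North"},
    {"customer": "C2", "region": "North"},
    {"customer": "C3", "region": "South"},
]

# Joining at the lowest level keeps one output row per sale.
print(len(inner_join(sales, customers, "customer")))

# Joining at the higher level pairs each sale with every customer in its region.
print(len(inner_join(sales, customers, "region")))
```

With three sales rows the customer-level join yields 3 rows and the region-level join yields 5. With a million source rows and many leaf elements per parent, the same pattern multiplies into a far larger intermediate result.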

      My fault, yes, but the "test" of the TableTransform should give a warning if your target dimensions aren't linked to fields at the lowest level of the dimension.
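
Until the tool warns about this itself, a similar check could be run by hand on the extracted lookup data. The sketch below is only an approximation of such a warning, not an existing feature: `duplicated_join_keys`, `check_join`, and the sample `customers` rows are all hypothetical. It treats a join key as "at the lowest level" when every key value occurs exactly once in the lookup.

```python
import warnings
from collections import Counter

def duplicated_join_keys(lookup_rows, key):
    """Return {key value: row count} for values that occur more than once."""
    counts = Counter(row[key] for row in lookup_rows)
    return {value: n for value, n in counts.items() if n > 1}

def check_join(lookup_rows, key):
    """Warn when the join key is not unique in the lookup; return True if it is safe."""
    duplicates = duplicated_join_keys(lookup_rows, key)
    if duplicates:
        warnings.warn(f"join key '{key}' is not unique in the lookup: {duplicates}")
        return False
    return True

# Hypothetical lookup: "customer" is the lowest level, "region" is its parent.
customers = [
    {"customer": "C1", "region": "North"},
    {"customer": "C2", "region": "North"},
    {"customer": "C3", "region": "South"},
]

print(check_join(customers, "customer"))  # unique key, no warning
print(check_join(customers, "region"))    # repeated key, emits a warning
```

A uniqueness count like this is cheap compared with the join itself, so it can run before every load.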

      -Bob