Resources requested from Palo

  • Resources requested from Palo

    Dear guys,

    I've just tested Palo 1.0 and 1.5 with a small cube of data and it works well.
    But when I tried to manage a lot of records (for example a cube with 6 dimensions, two of them large with more than 2000 elements each), I ran into serious CPU and RAM problems: the processes eventually blocked and I had to restart my notebook (Turion64, 1 GB RAM).
    I have tried other OLAP applications (PowerOlap, Applix TM1, etc.) and encountered no problems like this. With these (not open source) OLAP applications I can quickly manage thousands of records, whilst with Palo I have to wait a long time (sometimes 20-30 minutes).
    Is this normal?
    Is it a problem of the engine, the API, or the DLL?
    What resources does Palo require in order to manage a large amount of data more easily and quickly?
    Please let me know, because I am very interested in using and developing Palo.

    Thanks.
  • RE: Resources requested from Palo

    I've tried it, from smaller and sparser dimensions to bigger and denser ones.
    I put the same order of dimensions in the cube and compared the response time of each different OLAP (on the same notebook).
    The difference in time was big.
    When I want to read a thousand records, or send a thousand records into the cube, Palo works for 10-15 minutes (or more), the other OLAPs for 3-4 seconds. Is that a limit of Palo, or am I missing something?

    :)
    Alberto M. Vitulano
  • RE: Resources requested from Palo

    Is this an observation the Jedox/Palo staff can confirm and possibly even explain? If so, I'd rather rebuild at least one of my databases from scratch.

    Does the actual sequence in the cube definition matter, or is it mainly the sequence in which the dimensions have been introduced to the database?
    Does it matter if the dimensions are filled one by one, from sparse to big, or can I still fill the big one first?

    This would be an important design pattern, right?
  • RE: Resources requested from Palo

    Hi amvitulano,
    do you mean data import or data export? I have indeed found that this takes quite a lot of time using the import wizard of the Excel AddIn. I had a model with over 2 million records, and it took PALO about 1.5 hours to import them with the built-in import wizard. Then I tried the Kettle PALO plugin, and it took 18 minutes, which I think is acceptable so far. (The version mentioned there is 2.3.1 and not 3.2.1.)
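    For a rough sense of scale, a back-of-the-envelope comparison of the two import paths in Python, using the figures above (purely illustrative):

    [code]
    # Throughput of the two import paths described above.
    records = 2_000_000

    addin_seconds = 1.5 * 3600   # Excel AddIn import wizard: ~1.5 hours
    kettle_seconds = 18 * 60     # Kettle PALO plugin: ~18 minutes

    print(f"AddIn wizard: {records / addin_seconds:,.0f} records/s")   # ~370
    print(f"Kettle:       {records / kettle_seconds:,.0f} records/s")  # ~1,852
    print(f"Speedup:      {addin_seconds / kettle_seconds:.1f}x")      # 5.0x
    [/code]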

    Greetings from Cologne
    Holger

  • RE: Resources requested from Palo

    Hi Irbis,
    I think there is no official design pattern concerning the dimension sequence in a cube; I just tried it out. I believe it depends only on the inner cube sequence of the dimensions, not on the sequence in which they have been introduced to the database. I had the best results by JUST looking at the number of elements in a dimension, completely ignoring sparsity issues.
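    A minimal sketch of that rule of thumb in Python; the Dim type and the idea of handing an ordered dimension list to the cube definition are illustrative assumptions, not the real Palo API:

    [code]
    from collections import namedtuple

    # Hypothetical stand-in for a dimension: just a name and its elements.
    Dim = namedtuple("Dim", ["name", "elements"])

    def ordered_dimensions(dimensions):
        """Order cube dimensions by element count, smallest first."""
        return sorted(dimensions, key=lambda d: len(d.elements))

    dims = [
        Dim("Products", ["P%d" % i for i in range(2000)]),
        Dim("Months",   ["Jan", "Feb", "Mar"]),
        Dim("Regions",  ["North", "South", "East", "West"]),
    ]

    print([d.name for d in ordered_dimensions(dims)])
    # ['Months', 'Regions', 'Products'] -- use this order in the cube definition
    [/code]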

    Greetings from Cologne
    Holger
  • RE: Resources requested from Palo

    Hello Holger,

    OK, I can even imagine why this matters for the usage of the data in the cube, but I would have to look at the code to verify that.

    In the meantime I will give it a try with a trial copy of a cube I have here: 12 dimensions, 2 of them containing about 1000 elements (and of course, at the moment, at the beginning of the cube), with 2 or 3 levels of consolidation. That sounds like a promising specimen for a test. Fortunately this does not mean I have to recreate the dimensions, as they are rather complex; the number of elements is easy to determine.


    And about your first sentence: well, I think it would be helpful to collect some unofficial design patterns, and even mix them with less Palo-specific ones. Many mistakes have been made by each of us, and I at least am still not free of them today. I would have donated a candle to the god of BI for every good piece of advice on the methodology for creating a usable, fast and consistent Palo-based data warehouse given to me before I had to learn it the hard way.

    So if there is enough interest out there, we could certainly have a thread with wisdom of the form "if you have situation XYZ and you could go direction A or B, then do B, because later on this will prevent you from C" (C generally equals some kind of evil redoing of work) ... simple and advanced recommendations alike.
  • RE: Resources requested from Palo

    Well, I could test it right now with an import script in Excel, which unfortunately does some computations as well; a pure import might show an even more drastic result:
    [IMG:http://www.mario-wolframm.de/images/speed.JPG]
    Sorry for the German names of the dimensions (and let's not argue about the number of dimensions). The left column shows the initial sequence; the other two list two extreme sortings. I will not reformat the picture, just to have the 65 seconds down in the middle formatted correctly.

    I'm happy with the results [Edit: not with the gain as such, but with the proof that design does influence performance]. Query benchmarks may follow.

    Ciao,
    Mario

  • RE: Resources requested from Palo

    Hi h_decker,
    I'm not referring only to data import/export, but to retrieving data from the cube as well. Often, when I have to read over 10000 cells from the cube, it takes 2-3 minutes (if the CPU doesn't block first).
    My concern is the delay I meet every time I have to explore a data cube with Palo rather than with another OLAP.
    My question is: will a new, faster engine be developed that allows us to read/write data in the cube in a few seconds, like other multidimensional OLAPs?
    Thank you in advance.

    :)
    Alberto M. Vitulano
  • RE: Resources requested from Palo

    Good trial, Irbis.
    In the interest of developing a better version of Palo, I would like to note that I tried importing 8500 records into an 8-dimension cube in TM1. It takes roughly 7-8 seconds.
    That is approximately 10 times faster than Palo.
    Alberto M. Vitulano
  • RE: Resources requested from Palo

    OK, so I have to run the pure test, or use Cubeware or Imp:Palo when it's available. It's probably not fair to have Excel (on a client PC) in the chain of an import, especially when it is doing some math as well.
    But I'll export and create a dumb import file for a test. And try other ETL tools later on.

    Anyway, it's not about which other systems are better, because then I would have to relate that to the cost of these competitors. I want to use and support Palo, and so I try to make the best of it. And paying attention to the size of a dimension seems to matter. I could use the blunt side of a knife to cut bread, but using it correctly may yield slightly better results. And for the bread a scalpel won't be necessary; a knife will do.

    Ciao,
    Mario
  • I have two questions:

    Are you using the Palo 1.5 final release?
    If you compare Palo with other software, are these OLAPs installed on a separate server?

    We use Alea, and we have a dual Opteron server with 2 GB. For testing Palo we are using a PC with a 3.0 GHz P4 and 1 GB, and no separate server.

    So comparing a client (PC) + server (dual-processor machine) with a Palo client and server on the same PC is not fair.
  • Yes, but in your first post you wrote that you use a laptop, without a server.

    Normally you have a server and a client, so perhaps you could test on a server-client system to see what speed your cube would get on proper hardware. Or do you plan to use Palo only on laptops and single-seat PCs?

    For speed enhancement there are 2 tips:

    1. Smallest dimensions first, biggest last, as you can see in the test from Irbis.
    2. As much RAM as possible on the machine doing the calculation, normally the server. OLAPs are memory-based. If the memory is too small, the computer has to create a swap file, and that is a factor of 1000!!! (See the rough check after this list.)

    So if you plan a server, 4 GB is better than 2 GB.
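    A rough sketch of that check in Python; the ~50 bytes per stored cell (value plus coordinate overhead) is my own guess, not a documented Palo figure:

    [code]
    # Back-of-the-envelope check: will the cube fit in RAM, or will it swap?
    def fits_in_ram(stored_cells, ram_gb, bytes_per_cell=50):
        footprint_gb = stored_cells * bytes_per_cell / 1024**3
        return footprint_gb, footprint_gb < ram_gb

    # Example: a 120K-record cube (as mentioned elsewhere in this thread)
    # on a server with 512 MB of RAM.
    footprint, ok = fits_in_ram(stored_cells=120_000, ram_gb=0.5)
    print(f"{footprint:.4f} GB resident -> {'fine' if ok else 'expect swapping'}")
    [/code]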

    Perhaps this helps.

    PS: Perhaps you could install Palo on two computers, one as server and one as client.
  • RE: Resources requested from Palo

    Originally posted by amvitulano
    Hi h_decker,
    I'm not referring only to data import/export, but to retrieving data from the cube as well. Often, when I have to read over 10000 cells from the cube, it takes 2-3 minutes (if the CPU doesn't block first).
    My concern is the delay I meet every time I have to explore a data cube with Palo rather than with another OLAP.
    My question is: will a new, faster engine be developed that allows us to read/write data in the cube in a few seconds, like other multidimensional OLAPs?
    Thank you in advance.

    :)


    Hi,
    as stated in the roadmap, it is planned to give PALO 2.0 another performance boost.

    Greetings from Cologne
    Holger
  • OK, for the testbed, since there are questions about the absolute performance values:

    Client: WinXP notebook, 1.8 GHz, 1 GB RAM
    Network: 100 Mbit/s switched, not much loaded
    Server: one virtual machine on a two-processor 3 GHz system with 512 MB RAM assigned; the host has 4 GB

    I know this server is not an ideal fit, but I would not run any tests on the production system. This is our test and acceptance platform, and the cube mentioned above also performs quite satisfactorily with about 120K records, without swapping! This would probably not have been the case with Palo 1.0.
    The advantage of the VM is that I can easily start with a clean system.

    I did all three tests on the same basis, so they can be compared with each other, but not with any production setup.

  • @Irbis, are you using the final 1.5 version with the latest speed enhancement?

    I think that during import memory is not as important as when working with the cube, where the server has to calculate sums and other functions.

    The reason you got such better performance with the second order of dimensions (smallest first, biggest last) is that an OLAP looks in the first dimension and takes all positive matches, then takes all positive matches in the second, and so on. So your second set is optimized for the OLAP server.
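    That matching order can be illustrated with a toy sparse store in Python: nested dictionaries, one level per dimension, where a miss in an early dimension prunes everything below it. This is only a sketch of the principle, not how the Palo server is actually implemented:

    [code]
    # Toy sparse cube: nested dicts, one level per dimension. A lookup
    # narrows the candidate set dimension by dimension, so small, selective
    # dimensions at the front cut the search space earliest.

    def set_cell(cube, coords, value):
        for key in coords[:-1]:
            cube = cube.setdefault(key, {})
        cube[coords[-1]] = value

    def get_cell(cube, coords):
        for key in coords:          # first dimension, then second, ...
            if key not in cube:     # a miss early on prunes the whole subtree
                return None
            cube = cube[key]
        return cube

    cube = {}
    set_cell(cube, ("2006", "Jan", "Cologne", "ProductA"), 42.0)
    print(get_cell(cube, ("2006", "Jan", "Cologne", "ProductA")))  # 42.0
    print(get_cell(cube, ("2007", "Jan", "Cologne", "ProductA")))  # None
    [/code]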
  • Yes, I use build 1.5.0.39 as the client and server setup 1.5.3.21.

    As announced, I will do some comprehensive query tests whenever I have time for that. They will probably be more difficult to perform due to caching mechanisms in the client (?)

    I wonder when the server really calculates a sum and when it prefetches them. Sometimes I get the feeling that the "overall sum" simply appears too quickly to have been calculated on request. And if there is partial precalculation, can you tune it?

    About your last explanation: I am aware that this effect has to appear in every sensible implementation of a sparse matrix. I would rather replace the word "OLAP" with "Palo" in that paragraph, as other implementations might not obey the dimension order I used when defining the cube, but choose a different sequence internally. So I expected the effect, but now I also have a magnitude for its influence, and we may derive a hint about how the Jedox people implemented the cubes. Thanks anyway!

    Ciao,
    Mario
  • The second point first: we are using Alea, and the same holds there about the order of dimensions and speed.

    If you load Palo data from the server to the client, the server calculates all values with formulas; that is the reason why more memory means more speed. The server then holds all values, original and calculated, in memory, and you will see no speed difference between the two kinds of figures.

    So if you change one figure, the complete cube will be recalculated, like in a very, very big Excel sheet. If the complete cube is in memory, it's faster than swapping the cube.
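    A toy model of that "change one figure, recalculate" behaviour in Python; the whole-cube cache invalidation on every write is an assumption for illustration, not confirmed behaviour of Palo or Alea:

    [code]
    # Base cells live in a dict; the consolidated total is cached in memory.
    # Any write invalidates the cache, so the next read recalculates once
    # and is then served from memory (assumption: whole-cube invalidation).

    class ToyCube:
        def __init__(self):
            self.cells = {}
            self._total = None           # cached consolidated value

        def write(self, coords, value):
            self.cells[coords] = value
            self._total = None           # one changed figure -> recalc needed

        def total(self):
            if self._total is None:      # recalculate only when stale
                self._total = sum(self.cells.values())
            return self._total

    cube = ToyCube()
    cube.write(("Jan", "Cologne"), 10.0)
    cube.write(("Feb", "Cologne"), 20.0)
    print(cube.total())   # 30.0, computed once, then held in memory
    [/code]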