Improving Build Time for 3DSpaceIndex

By Jerry Luczyk and Mark Wenske, Infrastructure Architects for Dassault Systèmes

During the infrastructure roundtable at this past year’s COExperience, multiple attendees raised slow-running index builds as a problem for their customers. Fortuitously, Jerry Luczyk wrote a detailed white paper on the topic at a customer’s request around the same time. Most of the material for this article comes from that white paper.

There are many ways to improve build performance. Before you consider configuring parallel index builds, the first question to ask is:

Does your system have enough data for the gain in build speed to justify the effort of setting up parallel crawling? With only 1 million Business Objects, probably not. As you approach 10 million Business Objects, it might.

The information in this article assumes you are dealing with a large data set and wish to improve your index build performance. Much of the following technical guidance assumes a working knowledge of 3DEXP platform deployment and maintenance.

The basics of improving Full Index Building performance generally include:

• Fast hardware and disk, followed by sizing the components

• Parallel MQL crawling and other CloudView tweaks

• Ensuring your data is clean

• Multiple FCS servers, each with the converter installed, if you have a lot of files

• Tweaking various DB, OS, crawler and index parameters to optimize throughput

Dassault Systèmes has performed parallel index builds for multiple customers, and results vary. All customer data is unique, so what works for one customer, and the gains they achieved, will not carry over identically to another.

On the first bullet point, hardware and disk: compute infrastructure has the largest impact on index build performance and must be taken into account when tuning and optimizing 3DSpaceIndex. The goal is to ensure that, during a full build, no server component ever becomes saturated (more than 90% CPU utilization and/or RAM swapping). If you are running in a virtual environment, also make sure, where possible, that no VMs run on hosts with over-committed virtual resources. Once you have enough hardware, you may need to size the components. Examples would be:

• MQL crawler enovia.ini/mxEnv.sh settings for kernel and Java memory

• FCS settings for maximum file size, JVM settings and potentially redirecting all FCS calls to Central FCS (if you’ve replicated all remote FCS data to a Central site)

• Converter settings (usually not necessary)

• 3DSpaceIndex DeploymentInternal.xml settings for various CloudView JVMs
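
To make the saturation goal above concrete (no component above 90% CPU or swapping during a full build), monitoring samples collected during a build can be checked after the fact. The following is a minimal sketch only: the Sample record, its field names and the host labels are hypothetical, and it assumes you already export CPU and swap figures from whatever monitoring tool you use.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold taken from the article: a component counts as saturated above 90% CPU.
CPU_SATURATION_PCT = 90.0


@dataclass
class Sample:
    """One monitoring sample (hypothetical layout, not a product format)."""
    host: str           # label for the server component, e.g. a crawler or FCS host
    cpu_pct: float      # CPU utilization over the sampling interval, 0-100
    swap_in_kb: float   # memory swapped in during the interval
    swap_out_kb: float  # memory swapped out during the interval


def saturated_hosts(samples: Iterable[Sample]) -> List[str]:
    """Return hosts that exceeded the CPU threshold or swapped at least once.

    Hosts are listed once each, in the order they were first flagged.
    """
    flagged: List[str] = []
    for s in samples:
        swapping = s.swap_in_kb > 0 or s.swap_out_kb > 0
        if (s.cpu_pct > CPU_SATURATION_PCT or swapping) and s.host not in flagged:
            flagged.append(s.host)
    return flagged
```

Any host returned by saturated_hosts is a candidate for more CPU or RAM, or for moving to a less heavily committed hypervisor host, before the next full build.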

Turning to data cleanliness, make sure your data and index data are up to date by running “tidy vault”. Afterward, rebuild stats and indexes, since they may be out of date once the tidy completes. You should also update ACLs before crawling begins, using the MQL command “update accesslist on temp query bus * * * size 500;”. This removes a potential bottleneck in parallel indexing, since ACLs cannot currently be computed in parallel.

A corollary to stats and indexes is to configure database monitoring (such as Oracle AWR). The goal is to ensure your database does not run into problems during DB crawling. We have seen cases where some DB indexes needed to be disabled during the build to improve crawler performance.

At an architecture level, ensure each FCS is on its own VM or host, because you can have only one file converter per operating system.

Lastly, parallel crawling. Configure this in config.xml via the <DATASETS concurrency="4"…> parameter. As a reminder, you need enough data for parallel crawling to be worthwhile. Temporary CloudView tweaks can also help, such as disabling real-time aggregation so that the CloudView application focuses on building the index rather than aggregating data.
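
If you script your deployments, the concurrency change can be applied programmatically rather than by hand. The sketch below is illustrative only: the function name is hypothetical, and it assumes nothing about config.xml beyond the presence of a DATASETS element with a concurrency attribute, as named above. The rest of the file's structure is not described in this article, so test against a copy of your own file.

```python
import xml.etree.ElementTree as ET


def set_dataset_concurrency(xml_text: str, concurrency: int) -> str:
    """Return xml_text with concurrency set on every DATASETS element.

    Only the element and attribute names come from the article; everything
    else about the file is left untouched.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    root = ET.fromstring(xml_text)
    # root.iter() also visits the root element itself, so this works
    # whether DATASETS is the root or nested further down.
    datasets = list(root.iter("DATASETS"))
    if not datasets:
        raise ValueError("no DATASETS element found")
    for element in datasets:
        element.set("concurrency", str(concurrency))
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree discards XML comments and the XML declaration when it re-serializes, so keep a backup of the original file and compare the result before putting it into service.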

To conclude, execute the build! If you need further details or help, please reach out to your Dassault Systèmes services contact. Happy indexing!
