Start a conversation

ETL Jobs Failing: 'unique constraint (JIVE_ANA.JIVEDW_OS_STG_PK) violated' or 'DocumentStatusETL java.sql.BatchUpdateException: Batch entry

Overview

ETL jobs have started failing, with one of the following error messages appearing in the logs:

  • ERROR com.jivesoftware.community.analytics.etl.IdDrivenCopyETLTask - Error running ETL: DocumentStatusETL
    java.sql.BatchUpdateException: Batch entry
  • ERROR com.jivesoftware.community.analytics.etl.impl.ETLTaskExecutorImpl - Error executing tasks: ORA-00001: unique constraint (JIVE_ANA.JIVEDW_OS_STG_PK) violated
  • ERROR com.jivesoftware.community.analytics.etl.IdDrivenCopyETLTask - org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "jivedw_os_stg_pk"

This article describes the solution for this issue, as well as a workaround.

 

Environment

Jive Hosted and On-Premise versions prior to 9.0.6.

 

Solution

This issue occurs when a unique index on the jiveDocVersion table is violated. To permanently resolve this issue, please upgrade to the latest version of Jive.

A temporary workaround can be applied by following these steps:

  1. Make a backup of the application database.
  2. Run the following queries on the application database (adjusting them based on your application database type):
    1. Query-1

      SELECT v1.internalDocID AS id, v1.creationDate, COUNT(*)
      FROM (SELECT internalDocID, MIN(versionId) AS min_v, MAX(versionId) AS max_v, creationDate, modificationDate
      FROM jiveDocVersion WHERE status = 2 AND modificationDate > 0 AND internalDocID >= 0 AND internalDocID <= 10000000
      GROUP BY internalDocID, creationDate, modificationDate) v1
      LEFT JOIN (SELECT internalDocID, MAX(versionId) AS max_v, modificationDate
      FROM jiveDocVersion WHERE status = 11 GROUP BY internalDocID, modificationDate) v2
      ON(v1.internalDocID = v2.internalDocID)
      GROUP BY id, creationDate
      HAVING COUNT(*) > 1
    2. Query-2
      SELECT v1.internalDocID AS id, COALESCE(v2.modificationDate, -1), COUNT(*)
      FROM (SELECT internalDocID, MIN(versionId) AS min_v, MAX(versionId) AS max_v, creationDate, modificationDate
      FROM jiveDocVersion WHERE status = 2 AND modificationDate > 0 AND internalDocID >= 0 AND internalDocID <= 10000000
      GROUP BY internalDocID, creationDate, modificationDate) v1
      LEFT JOIN (SELECT internalDocID, MAX(versionId) AS max_v, modificationDate
      FROM jiveDocVersion WHERE status = 11 GROUP BY internalDocID, modificationDate) v2
      ON(v1.internalDocID = v2.internalDocID)
      GROUP BY id, v2.modificationDate
      HAVING COUNT(*) > 1
  3. Identify which ids have multiple rows.
  4. For each id that has multiple rows, follow the below guidance:
    1. Where all results are from the same userid: Leave the result with the latest modification date and delete the rest.
    2. When results return multiple statuses: Leave the result with the latest modification date and delete the rest. Alternatively, you can also delete the oldest of duplicate entries for the same status.
    3. If the results return multiple userids: Treat each userid separately using the treatment from the two bullets above
  5. Start the ETL (Extract, Transform, and Load) job again. 

 

<supportagent>

Support agents: For on-premise instances, Step 4 should only be done by the customer.

</supportagent>

Choose files or drag and drop files
Was this article helpful?
Yes
No
  1. Priyanka Bhotika

  2. Posted

Comments