  1. EJBCA
  2. ECA-10215

Database interruption during publishing can cause certificates to be lost

    Details

    • Issue discovered during:
      Customer
    • Sprint:
      EJBCA Team Alice - 2021 w37

      Description

      If the database connection is interrupted or stalled while a certificate is being published, the following may occur:

      • The transaction is rolled back (the certificate is not added to the CA's database).
      • Non-transactional operations invoked during certificate creation are not rolled back. For example, the certificate may still have been published, and/or submission to CT logs may have occurred, even though the final certificate isn't issued.

      During certificate creation, all related operations, such as end entity creation/editing, persisting the certificate, and adding it to the publishing queue, run in the same transaction. There are a few intentional exceptions, including audit logging, direct publishing, and persisting CT pre-certificates when an insufficient number of SCTs is returned (so that everything except the pre-certificate is rolled back).

      Generally speaking, this prevents internal data inconsistency upon failure. However, it isn't necessarily the best approach when "external" publishing occurs during the same transaction, since the external publishing cannot be rolled back with the rest of the transaction.

       

      Possible scenarios

      Publishing:

      1. Certificate is issued and the related events such as access control, end entity creation and certificate creation are logged.
      2. Certificate and end entity changes are staged for database persistence.
      3. Non-transactional publishing occurs.
      4. Before the transaction can be committed, there is a database interruption.
      5. Eventually the transaction will time out and roll back the changes. However, the certificate is already published.
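
The publishing scenario above can be illustrated with a small, self-contained toy model. This is only a sketch: `ToyDatabase`, `ToyPublisher` and `issue` are hypothetical names made up for illustration and are not EJBCA classes. It shows that discarding staged database writes does nothing to undo a side effect that already happened outside the transaction.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: staged database writes can be rolled back, external publishing cannot. */
public class PublishRollbackSketch {

    /** Hypothetical stand-in for the CA database with a single open transaction. */
    static class ToyDatabase {
        final List<String> committed = new ArrayList<>();
        private final List<String> staged = new ArrayList<>();
        boolean available = true;

        void stage(String serial) {
            staged.add(serial);
        }

        /** Commits the staged rows, or discards them if the database is unreachable. */
        boolean commit() {
            if (!available) {
                staged.clear(); // rollback: the staged rows are lost
                return false;
            }
            committed.addAll(staged);
            staged.clear();
            return true;
        }
    }

    /** Hypothetical stand-in for an external publishing target. */
    static class ToyPublisher {
        final List<String> published = new ArrayList<>();

        void publish(String serial) {
            published.add(serial); // takes effect immediately, outside any transaction
        }
    }

    /** Walks through steps 2 to 5 of the scenario; returns true if the transaction committed. */
    static boolean issue(String serial, ToyDatabase db, ToyPublisher publisher,
                         boolean interruptBeforeCommit) {
        db.stage(serial);              // step 2: changes staged for persistence
        publisher.publish(serial);     // step 3: direct, non-transactional publishing
        if (interruptBeforeCommit) {
            db.available = false;      // step 4: database interruption
        }
        return db.commit();            // step 5: commit, or roll back on failure
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        ToyPublisher publisher = new ToyPublisher();
        boolean ok = issue("0A1B2C", db, publisher, true);
        System.out.println("committed=" + ok
                + ", inDatabase=" + db.committed
                + ", published=" + publisher.published);
    }
}
```

With the interruption enabled, the toy database ends up empty while the toy publisher still holds the serial number, which mirrors the inconsistency described in step 5.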

      Likewise for CT submission:

      1. Pre-Certificate is issued and submitted to CT logs.
      2. Assuming sufficient SCTs are returned, the certificate is issued and the related events such as access control, end entity creation and certificate creation are logged.
      3. Certificate and end entity changes are staged for database persistence.
      4. Before the transaction can be committed, there is a database interruption.
      5. Eventually the transaction will time out and roll back the changes. However, the pre-certificate has already been submitted to the CT logs.

      If CT submission fails, e.g. due to insufficient SCTs, EJBCA will store the pre-certificate in an isolated transaction in order to avoid rollback. However, if database writing fails in the small time window between a successful CT submission and the transaction commit, the pre-certificate may be left published to CT logs without the certificate being persisted in the database.

       

       

      Workaround:

      Disable direct publishing, and use the publisher queue with a long enough execution interval instead. Certificates that are missing can be recovered from the audit log or server log, which are not rolled back.
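
To illustrate why queued publishing avoids the inconsistency, here is a minimal toy model in which the publish request is staged in the same transaction as the certificate and a separate worker publishes only what was committed. It is a sketch under stated assumptions: `ToyDatabase`, `issue` and `drainQueue` are hypothetical names for illustration and do not reflect how EJBCA's publisher queue is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model: the publish request is staged in the same transaction as the certificate. */
public class PublisherQueueSketch {

    /** Hypothetical database holding certificates and a publisher queue. */
    static class ToyDatabase {
        final List<String> certificates = new ArrayList<>();
        final Queue<String> publisherQueue = new ArrayDeque<>();
        private final List<String> stagedCertificates = new ArrayList<>();
        private final List<String> stagedQueueEntries = new ArrayList<>();
        boolean available = true;

        void stageCertificate(String serial) {
            stagedCertificates.add(serial);
        }

        void stageQueueEntry(String serial) {
            stagedQueueEntries.add(serial);
        }

        /** Commits both kinds of staged rows together, or discards both together. */
        boolean commit() {
            boolean ok = available;
            if (ok) {
                certificates.addAll(stagedCertificates);
                publisherQueue.addAll(stagedQueueEntries);
            }
            stagedCertificates.clear();
            stagedQueueEntries.clear();
            return ok;
        }
    }

    /** Issues a certificate and queues it for publishing in one transaction. */
    static boolean issue(String serial, ToyDatabase db, boolean interruptBeforeCommit) {
        db.stageCertificate(serial);
        db.stageQueueEntry(serial);   // no external side effect happens here
        if (interruptBeforeCommit) {
            db.available = false;     // database interruption before commit
        }
        return db.commit();
    }

    /** Queue worker: runs later and only sees committed queue entries. */
    static List<String> drainQueue(ToyDatabase db) {
        List<String> published = new ArrayList<>();
        String serial;
        while ((serial = db.publisherQueue.poll()) != null) {
            published.add(serial);
        }
        return published;
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        issue("AA", db, false);       // commits normally
        issue("BB", db, true);        // rolled back, including its queue entry
        System.out.println("certificates=" + db.certificates
                + ", published=" + drainQueue(db));
    }
}
```

In this model the rolled-back certificate never reaches the queue, so the set of published certificates always matches the set stored in the database.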

       

      Steps to reproduce:

      1. Add a delay in the code at the beginning of SignSessionBean.postCreateCertificate:
        try { Thread.sleep(30_000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
      2. Re-build and re-deploy:
        ant clean deployear
      3. Issue a certificate from the RA web. This will hang because of the delay.
      4. Now stop the database.

      Result: The certificate is created (postCreateCertificate is reached), but it is missing from the database and cannot be found on the "Search" -> "Certificates" page.

      (Don't forget to remove the delay in step 1 when you are done testing)

              People

              Assignee:
              Samuel Lidén Borell
              Reporter:
              Samuel Lidén Borell
              Verified by:
              Henrik Sunmark