How to ensure content integrity
A common problem for developers is that modifying definitions can invalidate all existing content. Nodetype definitions should be modified cautiously, because careless changes can corrupt the JCR repository to the point where content integrity is no longer ensured. Worse, this loss of integrity can go unnoticed until something exposes it, e.g. when importing a site.
Jahia provides best-practice recommendations to keep in mind when modifying content definitions. The table below comments on the different types of operations:
Type of modification | Operation | Comment |
---|---|---|
Namespace | Creation | Does not cause problems |
Namespace | Deletion | Should never be done. Instead of deleting the namespace, stop using it. |
Namespace | Modification | Should never be done. Instead, create a new namespace and stop using the previous one. A prefix or URL that was created before should never be reused. |
Node type | Creation | Does not cause problems |
Node type | Deletion | Should never be done before deleting all the instantiated nodes of this type from templates/sites. Nodes using the nodetype can be found and deleted using Jahia Tools/JCR Console. This operation can also be scripted (Groovy). |
Node type | Modification | Renaming a node type is equivalent to deleting the previous node type and creating a new one. |
Property of a node type | Creation | Does not cause problems |
Property of a node type | Deletion | Should never be done before setting the property to “null” on all the instantiated nodes; otherwise it will lead to publication issues. Alternatively, declare this property “hidden” and stop using it. |
Property of a node type | Modification | Should never be done if nodes have been instantiated with this property. If necessary, create a new property and refer to the “Deletion” row above. |
Modifications must be performed in two steps:
- First, prepare your content so that it complies with the future definitions
- Once your content is ready, modify your definitions
We will explain four types of modifications/deletions you may encounter during development and how to manage these changes with Groovy scripts:
- Deletion of a nodetype
- Deletion of a property
- Adding mandatory constraint to a property
- Adding regex constraint to a property
There are many more modifications you may encounter, but the goal here is only to show the common ones and possible approaches to address them.
Before proceeding, if you are not familiar with Groovy scripts, there are two ways to execute them:
- Execute them directly from the Tools: http://localhost:8080/modules/tools/groovyConsole.jsp
- Place your scripts in the folder /JahiaFolder/digital-factory-data/patches/groovy. If your platform is already running, the script will be executed automatically; otherwise it will be executed at startup, when the JCR context is ready
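Before running a script that modifies content, it can help to do a dry run that only lists the nodes that would be affected. The following is a minimal sketch of that idea, not an official Jahia tool. The helper names `buildStatement` and `listAffectedPaths` are hypothetical, and the helper relies on Groovy duck typing, so it works with any object that exposes the standard JCR session query methods. The string "JCR-SQL2" is the value of `javax.jcr.query.Query.JCR_SQL2`. The Jahia wiring shown in the trailing comment is an assumption based on the `JCRTemplate` call used in the scripts of this page.

```groovy
// Hypothetical helper: builds the JCR-SQL2 statement that selects every node
// of the given type located below rootPath.
String buildStatement(String nodeTypeName, String rootPath) {
    return "SELECT * FROM [" + nodeTypeName + "] WHERE ISDESCENDANTNODE('" + rootPath + "')"
}

// Hypothetical helper: returns the paths of the matching nodes without
// modifying anything. 'session' only needs to behave like a JCR session.
List<String> listAffectedPaths(def session, String nodeTypeName, String rootPath) {
    def nodes = session.getWorkspace().getQueryManager()
            .createQuery(buildStatement(nodeTypeName, rootPath), "JCR-SQL2")
            .execute().getNodes()
    List<String> paths = []
    while (nodes.hasNext()) {
        paths.add(nodes.nextNode().getPath())
    }
    return paths
}

// Assumed wiring inside Jahia (Groovy console or patch script), with the same
// imports as the other scripts of this page:
//
// def paths = JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
//     Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
//         return listAffectedPaths(session, "ins:myComponent", "/sites")
//     }
// })
// paths.each { log.info("Would be affected: " + it) }
```

Since nothing is saved, this kind of listing can be run repeatedly while the content is being prepared, and the modifying script is only run once the list looks as expected.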
Practical cases
Deletion of a nodetype
This modification can cause issues during a site export, and it will block the import process on another environment, because that environment will not know how to handle this nodetype.
This case is easy to fix: before removing a nodetype from your definitions, query all the instantiated content nodes of this nodetype and delete them.
The following Groovy script will do the trick:

    import org.jahia.api.Constants
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.RepositoryException
    import javax.jcr.query.Query

    /**
     * Remove a nodetype
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "ins:myComponent"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] WHERE ISDESCENDANTNODE('/sites')"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                node.remove()
            }
            session.save()
            return null
        }
    })

Deletion of a property
If a property needs to be deleted, first delete it from the existing content. Otherwise, the export procedures will raise errors, although these errors will not block the export or import process. Errors will also be raised when trying to publish this content.
The following Groovy script queries the nodes that have this property and removes it from each of them:

    import org.jahia.api.Constants
    import org.jahia.services.content.JCRPropertyWrapper
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.RepositoryException
    import javax.jcr.query.Query

    /**
     * Remove a property
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "ins:myComponent"
    final String propertyName = "propertyStringWithI18NToRemove"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] WHERE ISDESCENDANTNODE('/sites') AND [" + propertyName + "] IS NOT NULL"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                JCRPropertyWrapper property = node.getProperty(propertyName)
                property.remove()
            }
            session.save()
            return null
        }
    })

Adding mandatory constraint to a property
If this property is empty on existing content, you will need to choose how to proceed:
- Delete all nodes missing this property and have the nodes recreated by the content editors
- Add a default value on nodes where this property is empty; it will have to be properly filled in later
- List all nodes that require treatment and allow the content authors to manually enter the value for this property before proceeding with the definition modification
The following script will add a default value as well as list the impacted nodes:

    import org.jahia.api.Constants
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.RepositoryException
    import javax.jcr.query.Query

    /**
     * Property which will become mandatory
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "ins:myComponent"
    final String propertyName = "propertyWhichWillBecomeMandatory"
    final String defaultValue = "Default value : please replace me"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] WHERE ISDESCENDANTNODE('/sites') AND [" + propertyName + "] IS NULL"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                logger.info("Node missing mandatory '" + propertyName + "' property : " + node.getPath())
                node.setProperty(propertyName, defaultValue)
            }
            session.save()
            return null
        }
    })

Adding regex constraint to a property
This cannot be solved automatically. The best approach is to list all the content nodes that do not satisfy the future regex constraint, so that they can be edited before the definition is modified.
The following, more complex script lists the nodes using the nodetype and checks the value constraint. It needs to be executed on a development environment where the constraint has already been added:

    import org.apache.jackrabbit.core.value.InternalValue
    import org.apache.jackrabbit.spi.commons.nodetype.constraint.ValueConstraint
    import org.apache.jackrabbit.spi.commons.value.QValueValue
    import org.jahia.api.Constants
    import org.jahia.services.content.nodetypes.ExtendedPropertyDefinition
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.PropertyType
    import javax.jcr.RepositoryException
    import javax.jcr.Value
    import javax.jcr.nodetype.ConstraintViolationException
    import javax.jcr.query.Query

    /**
     * Property on which we will add a constraint (Regex, range)
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "ins:myComponent"
    final String propertyName = "propertyWhichWillHaveARegexConstraint"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] WHERE ISDESCENDANTNODE('/sites')"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            ExtendedPropertyDefinition propertyDefinition = null
            ValueConstraint[] constraints = null
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                // Nodes that do not carry the property have nothing to validate
                if (!node.hasProperty(propertyName)) {
                    continue
                }
                if (propertyDefinition == null) {
                    propertyDefinition = node.getApplicablePropertyDefinition(propertyName)
                    constraints = propertyDefinition.getValueConstraintObjects()
                }
                InternalValue[] internalValues = null
                // Retrieve value or values
                if (!propertyDefinition.isMultiple()) {
                    Value value = node.getProperty(propertyName).getValue()
                    InternalValue internalValue = null
                    if (value.getType() != PropertyType.BINARY
                            && !((value.getType() == PropertyType.PATH || value.getType() == PropertyType.NAME)
                            && !(value instanceof QValueValue))) {
                        internalValue = InternalValue.create(value, null, null)
                    }
                    if (internalValue != null) {
                        internalValues = new InternalValue[1]
                        internalValues[0] = internalValue
                    }
                } else {
                    Value[] values = node.getProperty(propertyName).getValues()
                    List<InternalValue> list = new ArrayList<InternalValue>()
                    for (Value value : values) {
                        if (value != null) {
                            // perform type conversion as necessary and create InternalValue
                            // from (converted) Value
                            InternalValue internalValue = null
                            if (value.getType() != PropertyType.BINARY
                                    && !((value.getType() == PropertyType.PATH || value.getType() == PropertyType.NAME)
                                    && !(value instanceof QValueValue))) {
                                internalValue = InternalValue.create(value, null, null)
                            }
                            // Skip values that could not be converted
                            if (internalValue != null) {
                                list.add(internalValue)
                            }
                        }
                    }
                    if (!list.isEmpty()) {
                        internalValues = list.toArray(new InternalValue[list.size()])
                    }
                }
                // Check constraints
                if (internalValues != null && internalValues.length > 0 && constraints != null && constraints.length > 0) {
                    for (InternalValue iValue : internalValues) {
                        // constraints are OR-ed together: a value is valid as soon as
                        // one constraint accepts it, so every constraint is tried
                        boolean satisfied = false
                        String violation = null
                        for (ValueConstraint constraint : constraints) {
                            try {
                                constraint.check(iValue)
                                satisfied = true
                                break
                            } catch (ConstraintViolationException e) {
                                violation = e.message
                            }
                        }
                        if (!satisfied) {
                            logger.info("ConstraintViolation on node : " + node.getPath() + " | " + violation)
                            // One report per node is enough
                            break
                        }
                    }
                }
            }
            return null
        }
    })

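When no environment with the new constraint is available yet, a rough pre-check can be done by testing the existing values directly against the future pattern. The following is only a sketch under assumptions: the helper name `findRegexViolations`, the sample paths and values, and the five-digit pattern are all hypothetical, and it assumes the constraint requires the whole value to match the pattern, which should be verified against the behaviour of your Jahia version. In a real script, the map would be filled from `node.getPath()` and `node.getProperty(propertyName).getString()` while iterating over the query result.

```groovy
import java.util.regex.Pattern

// Hypothetical helper: returns the paths whose value does not fully match
// the future regular expression. Null values (property not set) are skipped,
// since a value constraint only applies to values that exist.
List<String> findRegexViolations(Map<String, String> valuesByPath, String regex) {
    Pattern pattern = Pattern.compile(regex)
    List<String> violations = []
    valuesByPath.each { String path, String value ->
        if (value != null && !pattern.matcher(value).matches()) {
            violations.add(path)
        }
    }
    return violations
}

// Made-up sample data: node path -> current property value
Map<String, String> sample = [
        "/sites/mySite/home/a": "75001",
        "/sites/mySite/home/b": "7500",
        "/sites/mySite/home/c": "abcde"
]

// Hypothetical future constraint: exactly five digits
List<String> result = findRegexViolations(sample, "[0-9]{5}")
println result
// prints [/sites/mySite/home/b, /sites/mySite/home/c]
```

This check does not replace the script above, which uses the constraint objects of the actual definition; it only gives an early list of nodes that content authors will probably have to fix.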
Repository desynchronization
If you modify content that is already published, make sure to perform the changes in both workspaces: live and default. Otherwise, you will create a desynchronization that could block any publication of this content.
If you encounter a synchronization issue between workspaces, you can fix it by exporting only the staging site, reimporting it, and publishing it. This fix can only be used if you have no user-generated content (content generated only in the live workspace, such as forum posts or comments).
N.B.: If you do so, it may be difficult to know whether a piece of content was previously unpublished, and you could accidentally publish content that is still in a draft state.
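One way to avoid forgetting a workspace is to run the same piece of work once per workspace from a single script. The following sketch illustrates the idea with a hypothetical helper named `runInWorkspaces`. The workspace names "default" and "live" are the ones mentioned above; the Jahia call described in the comment is an assumption based on the scripts of this page, and a stand-in closure is used so that the sketch runs anywhere.

```groovy
// Hypothetical helper: runs the same piece of work once per workspace and
// returns the results keyed by workspace name.
Map<String, Object> runInWorkspaces(List<String> workspaces, Closure work) {
    Map<String, Object> results = [:]
    workspaces.each { String workspace ->
        results[workspace] = work.call(workspace)
    }
    return results
}

// Stand-in for the real work. In Jahia, the closure body would be
// (assumed wiring, same call as in the scripts of this page):
//   JCRTemplate.getInstance().doExecuteWithSystemSession(null, workspace, callback)
// where 'callback' is the JCRCallback that removes or updates the nodes.
Map<String, Object> report = runInWorkspaces(["default", "live"]) { String workspace ->
    return "processed " + workspace
}
println report
```

Returning a result per workspace makes it easy to log, for instance, how many nodes were changed in each one and to spot a difference between the two.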
Going further
Manipulation of areas
Another kind of integrity issue you might encounter is related to template areas. The name of an area determines the path of the contentList and of the nodes stored under it. If the name is changed in the JSP, the existing content is not automatically moved to the new expected path and continues to exist under the old path. As the template does not expect a contentList with this name, the content is not rendered (making it hidden) and editors cannot edit it.
User or scripted intervention is required to fix this issue, either by removing the now unreachable and hidden content or by renaming the node to match the new area name specified in the JSP.
This content will usually not generate issues during export or import operations, but it could lead to issues if you decide to modify the definitions of child nodes of these hidden nodes.
Furthermore, it is important to consider renaming the node to match the new area name, or reverting to the original area name in the JSP used by the template. This will prevent editors from thinking that the content is missing and that they need to recreate it.
Renaming an area
To make your content visible again, you have to rename all the impacted content lists with the new area name:

    import org.jahia.api.Constants
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.RepositoryException
    import javax.jcr.query.Query

    /**
     * An area has been renamed
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "jnt:contentList"
    final String siteKey = "mySiteKey"
    final String oldName = "areawhichwillchangename"
    final String newName = "arearenamed"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            // The selector alias lets ISDESCENDANTNODE() and NAME() refer to the queried node
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] AS contentList " +
                    "WHERE ISDESCENDANTNODE(contentList, '/sites/" + siteKey + "') " +
                    "AND NAME(contentList) = '" + oldName + "'"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                node.rename(newName)
            }
            session.save()
            return null
        }
    })

Removing an area
This case is easier to handle: you only have to delete the previous areas:

    import org.jahia.api.Constants
    import org.jahia.tools.patches.LoggerWrapper
    import org.jahia.services.content.JCRCallback
    import org.jahia.services.content.JCRNodeWrapper
    import org.jahia.services.content.JCRSessionWrapper
    import org.jahia.services.content.JCRTemplate

    import javax.jcr.NodeIterator
    import javax.jcr.RepositoryException
    import javax.jcr.query.Query

    /**
     * An area has been deleted
     */
    final LoggerWrapper logger = log
    final String nodeTypeName = "jnt:contentList"
    final String siteKey = "mySiteKey"
    final String nameOfTheArea = "areatodelete"

    JCRTemplate.getInstance().doExecuteWithSystemSession(null, Constants.EDIT_WORKSPACE, new JCRCallback() {
        @Override
        Object doInJCR(JCRSessionWrapper session) throws RepositoryException {
            // The selector alias lets ISDESCENDANTNODE() and NAME() refer to the queried node
            final String stmt = "SELECT * FROM [" + nodeTypeName + "] AS contentList " +
                    "WHERE ISDESCENDANTNODE(contentList, '/sites/" + siteKey + "') " +
                    "AND NAME(contentList) = '" + nameOfTheArea + "'"
            final NodeIterator iteratorSites = session.getWorkspace().getQueryManager()
                    .createQuery(stmt, Query.JCR_SQL2)
                    .execute().getNodes()
            while (iteratorSites.hasNext()) {
                JCRNodeWrapper node = iteratorSites.nextNode() as JCRNodeWrapper
                node.remove()
            }
            session.save()
            return null
        }
    })

Checking the content integrity of a website
If you want to check the integrity of your nodes against their definitions, for example before performing an export, we have started developing an unofficial module for this purpose:
- Link to our public Appstore : https://store.jahia.com/contents/modules-repository/org/jahia/modules/verify-integrity.html
- Github repository : https://github.com/jordannroussel/verify-integrity
For the moment, this module does not detect every possible issue, but we improve it from time to time.
If you want to contribute improvements or report issues, feel free to do so on the GitHub repository.
N.B.: this module is not officially supported by Jahia