Email threading is a technology that automatically identifies connected emails in a chain to produce one or more “Inclusive” emails. Inclusive emails are generally the end of the chains and include all prior “Lesser” or duplicate emails in the preceding text. There can be more than one inclusive email where a chain diverges or where new attachments are included.
It is increasingly common to use email threading to reduce the burden and expense of reviewing and producing multiple copies of different stages of the same email conversation. Consolidating the communications also helps to ensure consistency of redactions, reducing the risk of inadvertent disclosure. However, the use of threading can backfire significantly if the technology is not fully understood.
Though the UK rules on disclosure do not yet compel specific actions regarding threading, judgments in the US, which are often our litmus test for areas of new technology, have required parties to undertake costly remediation where their use of threading excluded relevant data that should have been disclosed.
In this blog post we examine some challenges with email threading and suggest how best to overcome them.
1. Impact on searching
If review involves a specific date range, custodian, or email participant, email threading should be the last step taken prior to review. Email chains can span years with participants changing frequently. Once threaded, the email search filters such as “From” and “To” will only represent the end of those chains. This can be unhelpful when needing to undertake ad hoc fact finding or investigation.
Your eDiscovery supplier should provide bespoke searchable fields marking the start and end of the entire chain, as well as all participants involved. Email threading generates a unique identifier for every chain, making it possible to obtain and propagate information from and across all connected items. This helps to ease the challenge of locating individuals or time spans within a reduced document set. Your eDiscovery supplier should also advise that threading must be re-applied in full if any additional data is added. This can add cost where data collection is ongoing, so it is worth considering whether threading can wait.
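To illustrate how chain-level fields might be derived from the thread identifier, here is a minimal sketch in Python. The field names (thread_id, sent, sender, recipients, inclusive) and the sample records are assumptions made for illustration only; they are not taken from any particular review platform, and your supplier's implementation will differ.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical export rows; every field name and value here is illustrative.
emails = [
    {"doc_id": "DOC-001", "thread_id": "T1", "sent": datetime(2019, 3, 1),
     "sender": "alice@example.com", "recipients": ["bob@example.com"],
     "inclusive": False},
    {"doc_id": "DOC-002", "thread_id": "T1", "sent": datetime(2021, 6, 15),
     "sender": "bob@example.com", "recipients": ["carol@example.com"],
     "inclusive": True},
    {"doc_id": "DOC-003", "thread_id": "T2", "sent": datetime(2020, 1, 10),
     "sender": "dave@example.com", "recipients": ["erin@example.com"],
     "inclusive": True},
]


def thread_summaries(rows):
    """Collect the start date, end date and all participants for each chain."""
    summary = defaultdict(
        lambda: {"start": None, "end": None, "participants": set()}
    )
    for row in rows:
        s = summary[row["thread_id"]]
        s["start"] = row["sent"] if s["start"] is None else min(s["start"], row["sent"])
        s["end"] = row["sent"] if s["end"] is None else max(s["end"], row["sent"])
        s["participants"].add(row["sender"])
        s["participants"].update(row["recipients"])
    return summary


def inclusives_matching(rows, summaries, participant=None, date_from=None, date_to=None):
    """Return Inclusive doc IDs whose whole chain matches the search criteria."""
    hits = []
    for row in rows:
        if not row["inclusive"]:
            continue
        s = summaries[row["thread_id"]]
        if participant and participant not in s["participants"]:
            continue
        # Keep the chain if any part of it overlaps the requested date range.
        if date_from and s["end"] < date_from:
            continue
        if date_to and s["start"] > date_to:
            continue
        hits.append(row["doc_id"])
    return hits


summaries = thread_summaries(emails)
by_person = inclusives_matching(emails, summaries, participant="alice@example.com")
by_date = inclusives_matching(
    emails, summaries,
    date_from=datetime(2019, 1, 1), date_to=datetime(2019, 12, 31),
)
```

In this sample, a search for alice@example.com or for the 2019 date range still surfaces DOC-002, even though that Inclusive email was sent in 2021 and does not name her in its own "From" or "To" fields.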
2. Recipient concerns
In some review platforms, the identification of Inclusive emails does not consider email recipients as part of the calculation. Where emails are sent separately to different groups of recipients, this means that there is no control over which of those groupings will be marked as the Inclusive email. The others will be marked as Lesser or duplicates. Review teams may not see the full list of recipients, and this can have implications for decisions regarding privilege.
Your legal team will need to know how the chosen eDiscovery software applies threading and where bespoke options are available regarding recipients. If all email participants are strategically important to the case, it is worth considering the use of threading only as a visual review aid rather than a means of document reduction.
Care will also need to be taken to ensure opposing parties are aware of any limitations.
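As a rough illustration of the recipient issue, the sketch below flags recipients who appear only on copies that were not marked Inclusive. It assumes a hypothetical segment_id field that groups copies of the same message content sent to different recipient groups; real platforms may expose this differently, or not at all.

```python
from collections import defaultdict

# Hypothetical records: "segment_id" is an assumed field that groups copies of
# the same message content regardless of who received them.
copies = [
    {"doc_id": "DOC-010", "segment_id": "S1", "inclusive": True,
     "recipients": ["bob@example.com", "carol@example.com"]},
    {"doc_id": "DOC-011", "segment_id": "S1", "inclusive": False,
     "recipients": ["counsel@lawfirm.example"]},
    {"doc_id": "DOC-012", "segment_id": "S2", "inclusive": True,
     "recipients": ["erin@example.com"]},
]


def hidden_recipients(rows):
    """Map each segment to recipients seen only on non-Inclusive copies."""
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row["segment_id"]].append(row)

    report = {}
    for segment, group in by_segment.items():
        everyone, shown = set(), set()
        for row in group:
            everyone.update(row["recipients"])
            if row["inclusive"]:
                shown.update(row["recipients"])
        missing = everyone - shown
        if missing:
            report[segment] = sorted(missing)
    return report


report = hidden_recipients(copies)
```

Here the report lists counsel@lawfirm.example against segment S1, which is the kind of recipient a reviewer would want to see before making a privilege decision.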
3. Predictive or Technology Assisted Review (TAR) considerations
As TAR is increasingly used in a flexible manner to better fit the case at hand, some review teams prefer to work outside of the traditional queue. Working this way allows review teams to make use of the predictive model scores whilst benefiting from a bespoke approach to how data is reviewed. However, working outside of TAR’s defined workflow can impact the precision, recall, and elusion scores so often needed for defensibility.
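For readers less familiar with these metrics, the sketch below shows how precision, recall and elusion are commonly calculated from a validation sample. The counts are invented for illustration, and individual platforms may define or estimate the figures differently, so treat it as a guide to the concepts rather than a description of any particular product.

```python
def tar_metrics(true_pos, false_pos, false_neg, true_neg):
    """Compute common TAR validation metrics from reviewed sample counts.

    true_pos:  predicted relevant, confirmed relevant
    false_pos: predicted relevant, confirmed not relevant
    false_neg: predicted not relevant, confirmed relevant
    true_neg:  predicted not relevant, confirmed not relevant
    """
    predicted_relevant = true_pos + false_pos
    actually_relevant = true_pos + false_neg
    discard_pile = false_neg + true_neg
    return {
        # Share of documents the model called relevant that really were.
        "precision": true_pos / predicted_relevant if predicted_relevant else 0.0,
        # Share of all relevant documents that the model found.
        "recall": true_pos / actually_relevant if actually_relevant else 0.0,
        # Share of the discard pile that was in fact relevant.
        "elusion": false_neg / discard_pile if discard_pile else 0.0,
    }


# Illustrative counts only.
metrics = tar_metrics(true_pos=80, false_pos=20, false_neg=10, true_neg=890)
```

If Lesser emails are removed from the reviewers' set but not from the model's, the counts feeding these calculations no longer describe the same population, which is why the scores can be affected.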
If the predictive model is restricted to the same core data set as the reviewers, including any limitation to Inclusive-only emails, then the challenge with metrics is resolved. However, Inclusive documents are likely to contain a larger volume of text. This creates additional challenges, as documents with excessive text content may not be scored by the model at all. Where they can be scored, there is a risk that relevance is buried amidst irrelevant content from the wider email chain. Machine learning may then miss it, or take longer to establish the necessary pattern, which can erode the time and cost savings you expected from email threading.
Your legal team will need to understand how the chosen eDiscovery software measures the success of TAR in addition to appreciating the general workflow and limitations of machine learning.
Where disclosure excludes Lesser emails, later challenges can arise with witness statements. Documents may be required for reference, only for the legal team to realise that they exist in disclosed form only as part of a later email chain. The witness may not even have been involved in the discussion at the time of the disclosed document. This leaves decisions to be made on whether document redactions or supplemental disclosures are required. Both options add to the cost and extend the overall case timetable.
Mitigating this requires a fine balance. If the review team can identify key discussions, for instance, the Lesser emails involved can be coded at the same time. This will slow the pace of review, and it requires the review team to have a strong sense of what is likely to be important. It is therefore usually best left to those closest to the case.
Another option is to identify where key witnesses appear in the wider thread of emails but do not appear in the inclusive emails. This involves making use of those custom options discussed earlier regarding thread participants. The workflow chosen will depend entirely on the circumstances, but the key message is not to delay these considerations until after disclosure.
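The witness check described above can be sketched as a simple comparison between everyone in a chain and those named on its Inclusive emails. The field names and sample data below are hypothetical, and the comparison uses header fields only, so a witness quoted in the body of an Inclusive email would still be flagged.

```python
from collections import defaultdict

# Hypothetical export rows; field names and addresses are illustrative.
emails = [
    {"doc_id": "DOC-001", "thread_id": "T1", "inclusive": False,
     "sender": "alice@example.com", "recipients": ["bob@example.com"]},
    {"doc_id": "DOC-002", "thread_id": "T1", "inclusive": True,
     "sender": "bob@example.com", "recipients": ["carol@example.com"]},
    {"doc_id": "DOC-003", "thread_id": "T2", "inclusive": True,
     "sender": "carol@example.com", "recipients": ["dave@example.com"]},
]


def witnesses_only_in_lesser(rows, witnesses):
    """Map each chain to witnesses who never appear on its Inclusive emails."""
    in_thread = defaultdict(set)
    in_inclusive = defaultdict(set)
    for row in rows:
        people = {row["sender"], *row["recipients"]}
        in_thread[row["thread_id"]] |= people
        if row["inclusive"]:
            in_inclusive[row["thread_id"]] |= people

    flagged = {}
    for thread_id, people in in_thread.items():
        missing = (people - in_inclusive[thread_id]) & set(witnesses)
        if missing:
            flagged[thread_id] = sorted(missing)
    return flagged


key_witnesses = {"alice@example.com", "carol@example.com"}
flagged = witnesses_only_in_lesser(emails, key_witnesses)
```

In this sample, chain T1 is flagged for alice@example.com, signalling that her Lesser email may be worth coding or disclosing in its own right before witness statements are prepared.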
In summary, email threading represents one of many technical processes which should be used to actively manage increasing data volumes. At the same time, like other legal technologies, it should not be used without careful consideration from both a technical and strategic viewpoint of the case beyond disclosure.