Technology and the changing practice of law: An entrée to previously inaccessible information via TRAC

 

LINDA ROBERGE¹, SUSAN LONG¹, PATRICIA HASSETT² and DAVID BURNHAM³

¹TRAC School of Management, Syracuse University, USA

E-mail: lroberge@syr.edu; suelong@syr.edu

²College of Law, Syracuse University, USA

E-mail: phassett@law.syr.edu

³S. I. Newhouse School of Communications, Syracuse University, USA

E-mail: burnham@epic.org

 

 

Abstract. The proliferation of electronic databases is raising some important questions about how

the evolving access to new or previously inaccessible information is likely to change the practice

of law. This paper discusses TRAC, an interesting electronic source of previously inaccessible information

that is currently used by members of the media, public interest groups, lawyers, and the

federal government. Summaries, reports, and snapshots of TRAC’s data can be accessed through a

series of public web sites. TRAC’s subscription service allows users access to the data warehouse

and data mining tools (see http://tracfed.syr.edu/info.html for more information). Additionally, the

paper examines how AI can be employed to assist the legal profession in utilizing TRAC’s

data. Finally, it speculates about how TRAC and other new electronic data sources may impact the

practice of law.

Key words: automated case advisor, data mining, data warehouse, Freedom of Information Act,

statistical information, U.S. Government

 

 

1. Introduction

Suppose for a moment that you are about to appear for the first time in a U.S.

District Court. You know the law, but you would also like a better picture of how

the system really works. What’s the likelihood that your case will be dismissed?

If not dismissed, how likely is your client to be acquitted? If not acquitted, what

are the typical prison sentences? Who are the Assistant U.S. Attorneys? How good

are they? Are the salaries high enough to keep the good ones there? You may have

all these questions and more, but where would you find such information? Perhaps

your colleagues could relate their experiences, but is there any hard data?

In this paper, we discuss how new information technologies, specifically data

warehouses coupled with data mining tools, can be used to make existing sources

of data available, accessible, and useful to lawyers. In particular, we demonstrate

how the Transactional Records Access Clearinghouse has created a data warehouse

and designed easy-to-use data mining tools that now provide a picture of the U.S.

federal system of jurisprudence that has never before been available.

As advanced information technology makes more and more electronic information

accessible to non-technical people, practicing attorneys need to keep an eye

on these developments and to ask how the availability of new information will impact

the practice of law. Will the information impose new duties on the lawyer? Will the

lawyer who fails to canvass the Internet for information relevant to a client matter be

guilty of malpractice in the same way that a lawyer’s failure to know the law might

be malpractice? On the other hand, will the information enable the lawyer to pursue

cases that would have been prohibitively expensive because of data inaccessibility?

2. TRAC: The power of information1

Transactional Records Access Clearinghouse, more commonly known by its acronym

TRAC, is a research center at Syracuse University that has developed

an application that can serve as a model for making information available. A

team of professionals with backgrounds in statistics, journalism, government records,

information systems, web development, and statistical programming serve

as TRAC’s primary staff members. In addition, public interest lawyers volunteer

their services to obtain raw data from the federal government via the Freedom of

Information Act. The purpose of TRAC is to provide the American people, as well

as institutions of oversight like Congressional committees, news organizations,

public interest groups and scholars, with the information they need to fairly judge

the performance of the federal government. (See http://tracfed.syr.edu/info.html for

more information.)

Researchers at TRAC achieve the center’s purposes through the creation of

a data warehouse and specially designed data mining tools. Users are able to

access the information via the internet using an ordinary web browser such as

Netscape or Internet Explorer. The data warehouse contains data from many different

sources including the Executive Office for United States Attorneys in the

Justice Department, the Administrative Office of United States Courts, the Office

of Personnel Management, the Internal Revenue Service, the Environmental

Protection Agency, the Census Bureau, and a range of other specialized federal

agencies. Areas covered include criminal enforcement, civil actions, administrative

enforcement by the IRS, federal staffing, records of judges and prosecutors, federal

expenditures and more.

As indicated by the center’s name, TRAC staffers always try to obtain transactional

information about individual matters rather than summary data. Concerning

criminal enforcement, for example, TRAC acquires data from the Executive Office

for United States Attorneys about each referral for prosecution. Data elements

about each referral include where and when the referral was filed, which agency

made the referral, the lead charge, the initials of the assistant U.S. Attorney handling

the matter, whether it was declined for prosecution and why, whether the

matter resulted in an indictment, the initials of the presiding judge, the outcome,

and if convicted, the sentence. TRAC continually submits FOIA requests to obtain

additional data from new sources.

By accessing the data warehouse, users can see the details about each individual

matter. The real power of the warehouse, however, is in the data mining tools

specifically developed by TRAC programmers using SAS, a powerful analysis software

suite. These tools allow users to generate tables, graphs and maps to answer

a wide variety of questions about how the law is enforced in different parts of the

country and how the enforcement of different laws has changed over time. The list

of potential questions is endless. The ability to find relevant information quickly

and easily gives a picture of the workings of the US federal government that would

be impossible to get by looking at the details of each action individually.

3. Understanding the technology

In TRAC’s application, which can serve as a model for others, the transformation

from data to information and subsequent publication requires three different

information technologies, namely data warehouses, data mining tools, and the internet.

While the vast majority of readers are likely to be familiar with the internet,

data warehouses and the statistical tools that allow users to analyze or “mine” the

data are less well known.

3.1. THE DATA WAREHOUSE

Data warehouses are not the same thing as databases. A database is a collection of

information that records and tracks the individual activities or transactions of an organization.

For example, when a government employee is hired, information about

the employee and his/her job is recorded in a transactional database. As information

about the employee changes (e.g., salary, work schedule, or grade) the database is

updated. The software that manages databases continues to decrease in cost and

increase in user friendliness. As a result, the number of transactional databases

has risen dramatically. When the numbers of databases proliferate, we end up with

both “information overload” as depicted in Alvin Toffler’s Future Shock, and with

isolated collections of data often referred to as “information silos” (Toffler 1970).

The greater the amount of data, the greater the number of silos, and the harder it

becomes to use data related to individual parts of an organization to understand the

organization in its entirety. Information overload and the resulting compartmentalization

of data has been cited as one of the reasons that multiple clues about the

September 11 terrorism attack were never connected (see for example MacDonald

2002).

One purpose of data warehouses is to enable connections among the individual

silos of information that exist in most large organizations including the U.S.

federal government (Jarke 2000). Data warehouses facilitate these connections because

they differ from transactional databases in several significant ways. First,

warehouses consist of one or more transactional databases that have been reengineered

so that they can be integrated. Second, in addition to transactional records,

warehouses may contain summarized data that can also be integrated with the

transactional data. And finally, warehouses contain historical data that is updated

periodically, often quarterly or yearly, rather than “live” data that is constantly

being updated in real-time. “A distinguishing characteristic of warehoused data is

that it is used for decision making rather than for operations” (Ballou 1999).

However, the integration of historical data via data warehouses brings its own

problems. Codes that designate categories may change over the years, the boundaries

of geographic districts may change, new codes may be added, others dropped,

and so on. There are a number of warehousing packages that purport to diminish

these problems. Nevertheless, when compared to database software, warehouse

software is neither inexpensive nor easy to use. Despite the existence of new

warehousing tools, “. . . [the] creation, maintenance, and daily administration of

data warehouses are still formidable tasks that are far from being fully automated”

(Benander 2000).
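The problem of codes that change meaning over the years can be illustrated with a small crosswalk. The following is a minimal sketch, not TRAC's actual procedure: the code values, year ranges, and category names are all hypothetical, and a real crosswalk would be built from agency documentation.

```python
# Hypothetical crosswalk: (agency code, first year, last year, harmonized category).
# The same agency code may be reused with a different meaning in later years.
CROSSWALK = [
    ("07", 1986, 1994, "FRAUD"),
    ("07", 1995, 2002, "NARCOTICS"),
    ("12", 1986, 2002, "FRAUD"),
]


def harmonize(code, year):
    """Map an agency code to a stable category for the given year.

    Returns None when no mapping exists, so unmapped codes can be flagged
    for human review instead of being silently misclassified.
    """
    for old_code, first_year, last_year, category in CROSSWALK:
        if old_code == code and first_year <= year <= last_year:
            return category
    return None


print(harmonize("07", 1990))  # code "07" before the hypothetical redefinition
print(harmonize("07", 1999))  # the same code after the redefinition
```

Keying the lookup on both the code and the year is what allows records from different periods to be compared under one set of categories.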

To build the data warehouse, the team of researchers at TRAC begins by searching

through government manuals, websites and other such sources to identify

relevant systems of records and transactional databases maintained by different

agencies. Based on these leads, TRAC makes requests for specific data sets and

all of the agency documentation describing the details of what is covered and how

the information is organized. The requests usually are made under the Freedom

of Information Act (FOIA). When release of the data entails a lawsuit and court

decision, this beginning step can require a great deal of time and expense.

Once the data sets and documentation are in hand, statistical and other kinds

of checks are made to test the completeness and reliability of the information

that has been provided. For example, have code tables been provided? Are all

codes appearing in the database listed in the code tables? Are there codes in the

tables that don’t appear in the data? Are some fields not used? Are there internal

inconsistencies – e.g., does a case end before it starts? Data from different sources

about similar events may be compared or merged as a further check on data set

reliability. While a few parts of the reliability assessment are generic across data

sources and can be automated, most of the process requires human intervention.

(For a more complete discussion of data quality in a data warehouse environment,

see for example Mallach 2000.) If the overall quality of a data set is poor, it will

not be merged into the data warehouse. On the other hand, if only a limited number

of fields are found to be unreliable, the data may be merged but the questionable

fields will not be used for analyses. When this happens, users are warned about the

shortcomings.
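Some of the generic checks listed above lend themselves to automation. The sketch below is illustrative only and assumes a hypothetical record layout (the field names `disposition`, `opened`, and `closed` and the code table are invented for the example); it tests for undocumented codes, unused codes, and matters that end before they start.

```python
from datetime import date

# Hypothetical code table and records; none of these values come from TRAC.
CODE_TABLE = {"A": "Prosecuted", "B": "Declined", "C": "Pending"}

RECORDS = [
    {"id": 1, "disposition": "A", "opened": date(2001, 3, 1), "closed": date(2001, 9, 15)},
    {"id": 2, "disposition": "Z", "opened": date(2001, 4, 2), "closed": date(2001, 5, 1)},
    {"id": 3, "disposition": "B", "opened": date(2001, 6, 10), "closed": date(2001, 2, 1)},
]


def check_quality(records, code_table):
    """Run three of the generic reliability checks on a data set."""
    used = {r["disposition"] for r in records}
    return {
        # Codes that appear in the data but are missing from the code table.
        "undocumented_codes": sorted(used - set(code_table)),
        # Codes listed in the code table that never appear in the data.
        "unused_codes": sorted(set(code_table) - used),
        # Internal inconsistency: a matter that ends before it starts.
        "ends_before_start": [r["id"] for r in records if r["closed"] < r["opened"]],
    }


report = check_quality(RECORDS, CODE_TABLE)
print(report)
```

A report of this kind only flags candidates; deciding whether a flagged field is unreliable still requires the human judgment described above.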

Most transactional data require augmentation before being loaded into data

warehouses so that data mining tools can be used effectively (Berry 1997). Therefore,

the linking, grouping, and classification variables that will be used to place the

data into geo-political-temporal context must be developed and added to the data.

Additionally, numerous performance criteria are defined, indicators developed, and

these too become part of the warehouse. Together these related sources of data,

called metadata, form a vital part of the data warehouse because they contain the

domain knowledge that makes the data useful (Dyche 2000; Kimbal 2002; Whitten

2000). Because the contextual needs of various users may be quite different

depending on their interests, TRAC attempts to provide as much related metadata

as possible including geography, population, time trends, constant/real dollars, etc.

Recently, TRAC has begun the ambitious task of incorporating new data into its

data warehouse on a monthly basis.

The size of TRAC’s data warehouse is enormous, taking up approximately 300

gigabytes of storage space. For example, because there were more than 140,000

criminal referrals for prosecution in a recent year and the online data go back

to 1986, the information in this area alone is extensive. (The Justice Department

recently estimated that they have supplied more than 25 million records to TRAC!)

In addition, TRAC’s enforcement data cover civil matters – where the government

is either the plaintiff or the defendant – and administrative actions by the IRS.

Along with the enforcement data there is information on staffing going back to

1975, federal spending going back to 1993, federal judges going back to 1986, and

federal prosecutors going back to 2001. While TRAC constantly strives to automate

the production processes for adding new data from existing sources, preparing data

from new sources continues to require the skills of highly trained statisticians, data

analysts, and computer programmers in addition to individuals with expertise in

the subject matter.

3.2. THE DATA MINING TOOLS

As daunting as establishing a data warehouse may be, creating information from

the warehouse is yet another formidable task. Given the size of most data warehouses,

finding information amid all the data can be a major problem. This is where

the data mining tools come into play. “The purpose of data mining is to explore the

data warehouse looking for trends, relationships, and outcomes” (Hall 2001). The

objective is to find the patterns that will provide a coherent unified view of the organization,

to discover what is happening within the organization, and to place this

information into a context that will make it understandable and usable for action.

This can only be accomplished by merging information from several databases.

For example, knowing how much money was spent in a particular district will be

much more meaningful if you can integrate population data to enable comparisons

between districts, or constant dollar calculations to enable comparisons over time.

It is the integration of separate databases, including the metadata, that provides the

geo-political-temporal information necessary for interpretation. For integration, we

need tools that allow users to specify what information they want without needing

to understand the complexities behind how the data is organized and combined, and

how the information was generated (Friedland 1998). While some data mining

tools explore data automatically, other types of tools operate in a semi-automatic

fashion in that they allow users to direct how and where to search for the trends

and relationships (Berry 1997; Witten 2000).
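The spending example mentioned earlier in this section can be made concrete. The sketch below assumes invented spending, population, and price-index figures (none are TRAC data) and, for simplicity, a single population figure per district for all years. It shows how per-capita and constant-dollar measures follow from joining the separate sources.

```python
# Hypothetical figures for illustration only.
SPENDING = {
    ("District A", 1995): 2_000_000.0,
    ("District B", 1995): 3_000_000.0,
    ("District A", 2000): 2_400_000.0,
}
POPULATION = {"District A": 1_000_000, "District B": 3_000_000}
PRICE_INDEX = {1995: 100.0, 2000: 120.0}  # assumed deflator, base year 1995


def per_capita(district, year):
    """Spending per resident, which allows comparisons between districts."""
    return SPENDING[(district, year)] / POPULATION[district]


def constant_dollars(district, year, base_year=1995):
    """Spending restated in base-year dollars, for comparisons over time."""
    return SPENDING[(district, year)] * PRICE_INDEX[base_year] / PRICE_INDEX[year]


# District B spends more in total but less per resident than District A.
print(per_capita("District A", 1995), per_capita("District B", 1995))
# District A's nominal increase by 2000 disappears once prices are held constant.
print(constant_dollars("District A", 2000))
```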

TRAC has developed three different types of semi-automatic data mining tools

that enable users to direct the analysis of the warehoused data. Although experienced

statisticians historically have been the ones to do data analysis, most users

don’t have access to such expertise (Dyche 2000). Thus TRAC’s tools were created

to allow both novice and advanced users to perform whatever data analysis they

need, regardless of their statistical expertise.

The first tool is called Express. As the name implies, this tool allows users to

quickly and easily produce counts, averages, medians, and other specially computed

measures that are used to generate rankings, comparisons, and trends. Users

can specify if they want the information by district, agency, program area, or lead

charge/cause of action. For IRS audits, users can also choose to have the information

produced by income class, selection reason, and auditor type. Additionally, users

are able to indicate whether they want the information returned to them in the form

of tables, graphs, or maps. Because this tool is both powerful and easy to use, it is

frequently the tool of choice for novices and power users alike.
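The kind of summary that Express produces can be approximated in a few lines. This is a hedged sketch rather than TRAC's SAS implementation: the districts and sentence lengths are invented, and the function name `summarize` is an assumption made for the example.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (district, prison sentence in months).
SENTENCES = [
    ("District A", 12), ("District A", 24), ("District A", 60),
    ("District B", 6), ("District B", 18),
]


def summarize(rows):
    """Group records by district, then compute counts, means, and medians."""
    groups = defaultdict(list)
    for district, months in rows:
        groups[district].append(months)
    summary = {
        district: {"count": len(v), "mean": mean(v), "median": median(v)}
        for district, v in groups.items()
    }
    # Rank districts from the longest median sentence to the shortest.
    ranking = sorted(summary, key=lambda d: summary[d]["median"], reverse=True)
    return summary, ranking


summary, ranking = summarize(SENTENCES)
print(summary)
print(ranking)
```

Reporting the median alongside the mean matters here because a few very long sentences can pull the mean well above what is typical.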

Sometimes users need multidimensional views of the data that facilitate comparisons

across groups, years, organizational entities, etc. Data miners refer to

this process as “slicing and dicing” (Dyche 2000). To understand how one district

handles health care fraud, for example, users may want to know how many of the

referrals get prosecuted versus how many are declined. How does this compare

with other districts? Has this changed over the years? To provide this capability,

TRAC has developed a second tool called “Going Deeper”. This tool allows users

to focus on a particular stage in the referral process and to generate performance

measures such as percentages, rates relative to the population, and outcomes. Going

Deeper, as the name suggests, provides a drill-down capability that enables users

to produce and view the data as a series of linked tables that focus on increasingly

narrower subsets of data down to a listing of the individual matters or federal employees.

As with the Express tool, Going Deeper is easy to use via a point-and-click

interface. Users with a wide variety of experience find that it too is user-friendly

and capable of generating complex information.
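The drill-down idea can be sketched as successive filters, each producing a narrower subset and a performance measure, ending in a listing of individual matters. The records, field names, and helper functions below are hypothetical and are not drawn from TRAC's system.

```python
# Hypothetical referrals; field names and values are illustrative only.
REFERRALS = [
    {"id": 101, "district": "District A", "program": "health care fraud", "status": "prosecuted"},
    {"id": 102, "district": "District A", "program": "health care fraud", "status": "declined"},
    {"id": 103, "district": "District A", "program": "narcotics", "status": "prosecuted"},
    {"id": 104, "district": "District B", "program": "health care fraud", "status": "declined"},
    {"id": 105, "district": "District B", "program": "health care fraud", "status": "declined"},
]


def drill_down(rows, **filters):
    """Keep only the rows that match every field=value filter."""
    return [r for r in rows if all(r[k] == v for k, v in filters.items())]


def percent_declined(rows):
    """Share of referrals declined, or None when the subset is empty."""
    if not rows:
        return None
    return 100.0 * sum(r["status"] == "declined" for r in rows) / len(rows)


# Level 1: all health care fraud referrals, nationally.
fraud = drill_down(REFERRALS, program="health care fraud")
# Level 2: the same program area within one district.
fraud_a = drill_down(fraud, district="District A")
# Level 3: the individual declined matters behind the district figure.
declined_a = drill_down(fraud_a, status="declined")

print(percent_declined(fraud), percent_declined(fraud_a))
print([r["id"] for r in declined_a])
```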

The most advanced tool is the “Analyzer”. This tool allows users to specify a

particular slice of data that is of interest to them, and to store their own unique

subset of data in a personal “web locker”. From the web locker, the user can run

numerous types of sophisticated analyses, the results of which can also be stored

in the web locker. As an adjunct to Express and Going Deeper, Analyzer provides

users with the ability to perform sophisticated analyses on any subset of data in

which they are interested.

Through the use of these tools, users are able to literally “create” information

by entering the data warehouse and analyzing the data contained in the individual

transactional records concerning each matter. Although the tools are easy enough

for beginners to use, they are powerful enough that many statisticians and data

analysts choose to use them over traditional analysis software. Examples of the user

interfaces for these tools are located in Appendix A.

3.3. THE WEB SITES

To provide access to the data warehouse and the data mining tools, TRAC harnesses

the power of the Internet by maintaining two broad categories of web sites. First is

a series of six free public web sites that mostly focus on the criminal enforcement

activities of the Federal Bureau of Investigation, the Immigration and Naturalization

Service, the Drug Enforcement Administration, the Bureau of Alcohol, Tobacco and

Firearms, the Customs Bureau, and the Internal Revenue Service. (Sites can be

entered from http://trac.syr.edu/.) TRAC’s IRS site also includes information about

IRS administrative actions (audits, seizures, levies and liens, etc.). Using data from

the data warehouse, the free web sites offer pre-selected, but very extensive, views

of each agency’s enforcement activities, both nationally and within individual

districts, along with graphs, maps, and tables that highlight interesting findings

and trends over time. The free sites also offer special studies on such subjects as

counter-terrorism enforcement and long-term changes in federal staffing. The free

sites do not allow users access to the data mining tools that would allow them to

tailor the information to meet their own unique needs.

TRAC’s second offering consists of a dynamic subscription site that provides

vastly more information as well as access to the data mining tools. (Available at

http://tracfed.syr.edu/.) In the criminal area, for example, enforcement data can be

organized by statute, district, Justice Department program category, and by virtually

any agency. In the civil area, extensive information about the civil matters

processed by the U.S. Attorneys where the government is the plaintiff or the defendant

can be organized by cause of action, government role, or court type in

addition to district and agency. Yet another area offers agency-by-agency staffing

information, from statistical overviews by federal judicial district, state, county or

city down to the names and salaries of individual employees. Recent additions

to this area include express tools that allow users to compare and contrast the

records of individual federal prosecutors and district court judges, and to obtain

listings of the individual cases. Federal expenditure data, agency-by-agency and

program-by-program, provide yet another perspective on the government.

TRAC’s uniquely comprehensive collection of information has become an important

stop for those interested in the actual operations of the federal government.

TRAC’s six public sites, for example, are receiving “hits” at a rate of more than 5

million a year. Clients of TRAC’s subscription service now number more than 300

individuals plus an ever-growing number of libraries, congressional committees,

and law firms with site licenses. Additionally, for data that has not yet been sufficiently

processed to be included in the data warehouse, clients can commission

special pre-release statistical runs.

4. Lawyers, data, and the TRAC model

Without advanced information technology, seasoned lawyers have relied on their

experience with the workings of the legal system to make decisions about how

to handle a case and what advice to give to their clients. Many law firms allocate

substantial research dollars and staff time to costly efforts aimed at collecting anecdotal

information about the functioning of the system in particular districts and the

professional proclivities of individual judges and prosecutors. Unfortunately, experience

can be misleading. Are defendants really more successful than plaintiffs in

getting adverse trial outcomes reversed on appeal? Does one judge impose longer

sentences than another for similarly situated defendants? How frequently does a

particular prosecutor decline certain types of cases? Before giving advice based

on experience, it would be nice to be able to confirm impressions, hunches, and

anecdotes with data. In fact, as other sources of information become available, the

practicing lawyer needs to be aware that such information is likely to raise the best

practice standards of the profession.

The five scenarios below present situations that lawyers might encounter where

existing data available through TRAC could be helpful.

Scenario 1: It will cost X amount to bring a civil suit against the government.

For this type of case, how often does the court find in favor of the plaintiff? In

what proportion of cases are monetary damages awarded? What is the average

amount of damages awarded? Given the Assistant U.S. Attorney handling the

case, is the settlement likely to be more or less favorable than average?

Scenario 2: You have a pending environmental case and are negotiating with

a specific Assistant US Attorney. You would like to obtain a listing of other

environmental matters handled by this particular prosecutor so that you could

see which cases were declined or dismissed, which went forward, and what

the outcomes were.

Scenario 3: Your client has been notified that the IRS has decided to audit his

return. For this type of audit, what percentage of cases is assessed additional

taxes and penalties? On average, how much additional tax is owed? If the

IRS offers a compromise, what would constitute a reasonable offer?

Scenario 4: After an investigation, the FBI has referred your client’s case

to the U.S. Attorney’s office for prosecution. What proportion of the FBI’s

referrals gets declined because there hasn’t been adequate preparation of the

case? If your client is prosecuted, what is the record of this prosecutor in

obtaining jail time for this type of charge? If the prosecutor offers a plea

bargain, should you advise your client to accept?

Scenario 5: You are concerned about defending a racketeering case because

you are afraid it will drag on forever. For your district, how long has it taken to

prosecute other cases with this lead charge? For this prosecutor? For the judge

who will be hearing the case?

5. Enhancing TRAC for lawyers

While TRAC is a fully functional, powerful application used by many outside the

legal profession, it has not yet come close to realizing its full potential for lawyers.

Two major categories of problems currently stand in the way. The first category

relates to problems that are experienced by the developer in adding relevant new

data to the data warehouse, most notably cost. Developing a data warehouse and user-friendly

data mining tools, and delivering the product to consumers, is intellectually

challenging, technically demanding, time consuming, and amazingly expensive.

When the extra burden associated with FOIA lawsuits to obtain needed new sources

of data is added in, the cost structure of the application becomes prohibitive. In

short, providing information is a costly business. While vitally important, these

issues are beyond the scope of this paper.

The second category of problems that stand in the way of lawyers realizing

the full potential of TRAC includes a number of related issues that many lawyers

encounter in attempting to use what is, fundamentally, a statistical application.

After all, the qualities that make a good lawyer aren’t necessarily the same ones

that foster statistical thinking. Luckily, this category is more easily ameliorated

than the first.

An important point to consider here is how lawyers differ from other users.

In some respects, lawyers are more informed users. They understand how matters

move through the legal system from the referral stage through final disposition.

They also are more familiar with the laws than other users. In other respects, however,

lawyers differ very little from their counterparts in other professions. They

tend to ask the same types of questions and encounter the same types of problems.

Analyzing the questions received by TRAC’s help desk creates a good picture of

ways in which TRAC can be enhanced to assist lawyers with their data utilization

needs.

5.1. BARRIERS AND USER EXPERIENCE

The types of barriers faced by users can be grouped into three general categories:

technical, informational, and conceptual. The technical barriers relate to the problems

that users may encounter with the use of TRAC’s data mining tools, printing,

or problems that are unrelated to the data. In general these barriers tend to be related

to a user’s “web-savvy-ness”. Users who surf the web with ease experience few

difficulties. For non-surfers, knowing where to click can be a real problem.

The informational barriers are those that relate in some way to the data, the

performance measures, or to the statistics. Examples include the meaning of codes,

what is covered, the difference between the mean and the median, etc. While some

of the informational and technical barriers are of similar complexity, many of the

informational barriers require more advanced assistance from the help desk.

The last category is an amalgam of sometimes very abstract barriers. Included

are issues ranging from “Why would I be interested?” to the correct use of probabilities

and data as evidence. The help needed here requires a great deal of staff

expertise.

Within each barrier category, specific questions are largely dependent on how

much experience the user has had with the web in general and, more specifically,

with TRAC’s system. Novice users don’t experience the same problems that more

experienced users encounter. However, users of all experience levels encounter, to

some degree, all types of barriers. For example, new users who do some initial

exploring generally find that the data mining tools are very easy to use. They

are readily able to point and click and get “numbers” in return for their efforts.

They find it harder to figure out what those numbers mean, or why they might be

interested. Many first time users lack confidence in deciding which information is

most relevant to them.

On the other hand, users with more experience may know exactly what they are

looking for and why they need the information. Many of these intermediate users

have precise questions they want to answer but are unsure how best to generate

the exact information required. Lastly, power users tend to understand exactly the

questions to ask and how to generate the information to answer those questions.

While they may encounter occasional problems with use of the advanced tools,

their major barriers often relate to the conclusions that can be drawn.

As an aid to designing appropriate help facilities, the barriers, as faced by differently

experienced users, have been organized into a simple matrix. Table I below

presents the matrix along with sample questions in each category.

6. Using AI to overcome barriers

Currently TRAC works with users individually to answer questions and solve problems.

As long as the number of users remains small, existing staff members are

able to manage the volume of help desk calls. However, as users grow not only in

numbers, but more importantly, in expertise, the types of questions presented to the

help desk may require more knowledgeable staff. As a result, TRAC is exploring

ways to utilize Artificial Intelligence to provide some of the assistance to users that

is now being supplied by senior staff members.

Table I. Taxonomy of barriers to the use of TRAC

Many of the barriers being handled by the help desk are not unique to the users

of TRAC’s system. They are, in fact, similar to problems that have been solved

through the application of various existing AI technologies. For example, a rule-based

system that would assist users in diagnosing and solving their own technical

problems could be modeled after the myriad diagnosis systems that currently exist.

(Examples of diagnosis systems can be found in basic texts such as Durkin 1994

and Turbin 2001a.) A natural language interpreter could translate informational

questions from novice and intermediate users into a syntax that could query the

warehouse directly. TRAC has previously investigated a natural language interface

(Roberge 1994), and continues to explore many other avenues as well.
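To indicate what a rule-based assistant for technical barriers might look like, the following minimal sketch matches reported symptoms against a small rule set. The symptoms, remedies, and the `diagnose` function are hypothetical; they are not taken from TRAC's help desk or from any existing diagnosis system.

```python
# Hypothetical rules: (symptoms that must all be present, suggested remedy).
RULES = [
    ({"page_blank", "popups_blocked"}, "Allow pop-up windows for the site and retry."),
    ({"print_cut_off"}, "Switch the page orientation to landscape before printing."),
    ({"login_fails", "caps_lock_on"}, "Turn off Caps Lock and re-enter the password."),
]


def diagnose(symptoms):
    """Return the remedy of every rule whose conditions are all satisfied."""
    observed = set(symptoms)
    return [remedy for conditions, remedy in RULES if conditions <= observed]


print(diagnose(["print_cut_off"]))
print(diagnose(["page_blank"]))  # no rule fires on a partial match
```

An empty result is itself useful, since it identifies the questions that still need to be routed to a staff member.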

In Table II below, user assistance solutions that have been used successfully

in other contexts have been mapped to the matrix of TRAC user barriers. The

remaining shaded area represents those problems that are unique to TRAC. Interestingly,

these are also the areas that require the most specialized staff expertise.

Therefore developing AI solutions that allow users to solve their own conceptual

problems would be especially valuable. The remainder of the paper will focus

on how Artificial Intelligence can be used to enhance TRAC’s data resources for

lawyers.

6.1. SYSTEM OF THE FUTURE

Developing a system that will meet the informational needs of power users and the

conceptual needs of all users, as depicted in the shaded area of Table II, provides

an exciting opportunity to build on existing resources. TRAC’s data warehouse,

data mining tools, user menus and screens, etc. (i.e., the metadata discussed

previously) already have considerable built-in knowledge about statistical procedures,

the organization of the U.S. government, the United States Code of laws, etc. This

knowledge is sometimes encapsulated as small systems of rules and at other times

exists within the code that processes the user requests. Clearly this knowledge is

available to be tapped to support users in more abstract problems.

Table II. Application of standard AI technology to barrier reduction

Currently, TRAC is looking for funding to develop an Expert System in the

form of an Automated Case Advisor aimed at the practicing lawyer. The system

as envisioned would be developed in two separate but interconnected modules

each of which would take advantage of the existing knowledgebase. The first and

least complex of the Automated Case Advisor modules would meet the conceptual

needs of novice and intermediate users. It would provide information of the type

already being generated by more sophisticated users. A detailed example of what

is envisioned is presented in Appendix B.

The second module would address the case needs that lawyers have for prediction,

planning, and diagnosis. For example, this module would help lawyers answer

the following types of questions:

Based on cases similar to mine, how likely will I be to win my case?

What have been the critical factors in other cases?

What are the weaknesses in my case?

What supporting evidence is needed in order to make my argument valid?

In addition to drawing on the data warehouse and existing knowledgebase,

the second module would benefit greatly from more detailed data (which TRAC

is currently seeking) about each case. Furthermore, advanced statistical knowledge

would need to be encoded such as what statistics were appropriate to use,

what the appropriate controls were, and what limitations the data imposed on the

conclusions that could be drawn.

As envisioned, extending TRAC’s scarce expert resources with an Automated

Case Advisor would be well suited for expert system technology for the following

reasons. First, because TRAC’s focus is limited to data related to U.S. federal court

practice, the system wouldn’t need to incorporate state laws and jurisdictional

issues. This makes the domain complex enough that an expert system would be

worth developing yet restricted enough to be workable. This balance is not always

easily achieved. Second, data that provides the underlying foundation for

the system exists. TRAC currently has in its warehouse source data sufficient to

complete Module 1; FOIA suits are underway to obtain access to more detailed

data from new sources for Module 2. Third, both the legal and statistical expertise

exists and can be verbalized. In this case, the more intuitive legal insights and

expertise that might be difficult to verbalize would be replaced by information

generated by data-analytical processes. Finally, the system would augment both

statistical and legal expertise, both of which are rare and expensive. To the extent

that an Automated Case Advisor could replace a lawyer or statistician on the help

desk, the payoff could be substantial. (See, for example, Ignizio 1991 and Turban

2001b for discussions of problem characteristics that are suitable for Expert System

development.)

In addition to alleviating help desk congestion, TRAC foresees other benefits

accruing from development of an Automated Case Advisor. Because the user interface

would guide users through an interactive process of information gathering, the

users need only enter information they know. For novice users, this would enable

them to get information they may not have realized existed. In other words, the

Automated Case Advisor could help new users envision the possibilities that exist

for obtaining relevant information. It would also help them to learn the capabilities

of the system and move to new levels of expertise.

7. Discussion: The impact of new or previously inaccessible information

TRAC is one example of how lawyers and other non-analysts can be provided

with an entrée to previously inaccessible information. TRAC may be at the cutting

edge of the electronic dissemination of information, but it is not alone. Moreover,

every advancement that TRAC makes in finding, packaging and disseminating information

is likely to raise the standard (albeit slowly) across the field. As more

information becomes available, the practicing lawyer needs to consider how the

increased availability may impact the best practice standards of the profession.

The kind of information we have been discussing here has only been available

through what we generally call “experience”. The legal rules identified in

the statutes and cases establish the rights and duties that apply in a particular

situation. However, the letter of the law in the books is often different from the

operation of the law in practice. The practicing lawyer’s experience in past cases

often provides him/her with insights about the nature of the differences between

theory and practice.

Experience is not equally available to all lawyers. A system that enabled lawyers

to access some or all of these insights would be desirable so that clients of all

lawyers have the benefit of such information.

In addition, insights can be misleading. So, some confirmation of such information

would be desirable. A recent study, for example, attempted to test the widely

shared belief that defendants are more successful than plaintiffs in getting adverse

trial outcomes reversed on appeal (Cox 2002). Basically, the study wanted to know

what percentage of plaintiff-appeals cases were decided for the plaintiff, and what

percentage of defendant-appeals cases were decided for the defendant.

Individual cases are available for free from a variety of sites including court

sites (e.g. http://www.uscourts.gov/) and the Cornell Law School Legal Information

site (http://www.law.cornell.edu/), but the sites are not set up to permit users

to ask questions of the kind raised in the appellate success study. Nor can users ask

questions about regional variations (if any), about variations over time (particularly

as the court membership changes), or about differences in success rates for various

kinds of cases (are plaintiffs significantly more successful on appeal in employment

cases than in insurance cases?).
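The kind of question raised by the appellate success study can be sketched as a simple grouped rate calculation. The records below are made-up rows for illustration only, not results from the study or from TRAC, and the field layout `(case_type, appellant, appellant_won)` is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Appeal = Tuple[str, str, bool]  # (case_type, appellant, appellant_won)

# Made-up records for illustration only; they are not real case data.
SAMPLE_APPEALS = [
    ("employment", "plaintiff", True),
    ("employment", "plaintiff", False),
    ("employment", "defendant", True),
    ("insurance", "plaintiff", False),
    ("insurance", "defendant", True),
    ("insurance", "defendant", False),
]


def success_rates(appeals: Iterable[Appeal]) -> Dict[Tuple[str, str], float]:
    """Share of appeals won by the appellant, per (case_type, appellant)."""
    wins: Dict[Tuple[str, str], int] = defaultdict(int)
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for case_type, appellant, won in appeals:
        key = (case_type, appellant)
        totals[key] += 1
        wins[key] += int(won)
    return {key: wins[key] / totals[key] for key in totals}


if __name__ == "__main__":
    for (case_type, appellant), rate in sorted(success_rates(SAMPLE_APPEALS).items()):
        print(f"{case_type:<12}{appellant:<11}{rate:.0%}")
```

Adding a region or a year to the grouping key would extend the same calculation to the regional and over-time comparisons mentioned above.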

The information gleaned from answers to such questions can help the practitioner

to evaluate the more traditional legal rule information contained in the text

of the statutes and cases. With access to both kinds of information, the individual

practitioner may be able to improve the advice to be given to clients about whether

or not to appeal in a particular case.

Although TRAC’s data warehouse is limited to federal information, the points

made above are equally applicable to other levels of information. For example,

understanding the investigative and prosecutorial policies of the New York State

Attorney General’s office could be very useful to the lawyer who is representing

a client in a matter involving the AG’s office. Yet, the AG’s web site

(http://www.oag.state.ny.us/home.html) displays limited information about its investigative

and prosecutorial activities. One section describes the functions of

various departments. Another contains press releases that describe some investigative

and prosecutorial activities. A site user could sift through the press releases

and count the incidence of particular investigations or prosecutions. But would the

results be complete or meaningful? Some smaller investigations or prosecutions

may not be memorialized in a press release. The press releases may report the prosecutions

in a particular matter, but the convictions, acquittals or other dispositions

of the prosecutions may not make it into a press release.

There is also another major group of seldom-considered legal practitioners –

the lawyers who are responsible for administering the law. At the federal level this

group includes the United States Attorneys who have an important supervisory role

in the functioning of criminal and civil systems within each of the judicial districts

and the heads of the numerous divisions within the Justice Department.

Conversations with a number of former department executives and U.S. attorneys

strongly suggest that many of them focus on the handling of individual cases.

While such an emphasis sometimes is justified, it can also impede the consideration

of broader administrative questions. Does the allocation of assistant U.S. attorneys

around the country represent real needs or is the process sometimes influenced

by irrelevant political considerations? How well does the distribution of criminal

referrals coming from the investigative agencies meet the needs of the district?

Given the nature of the district, are the agencies paying too little or too much

attention to white-collar crime, or official corruption or immigration problems?

To answer such questions, the lawyer/administrator needs information about the

actual number of assistant U.S. attorneys in each district and the number of assistants

in relation to the district’s population. The lawyer/administrator also should be

able to generate appropriate statistical summaries so a variety of comparisons can

be made about the performance of the federal agencies working in one area and

how this performance ranks with other districts.

Although the federal lawyer/administrator can obtain certain raw counts on the

web site maintained by the Executive Office for United States Attorneys, it does not

appear that more sophisticated material – even simple per capita rates and percents

– is easily available to help them be effective managers.
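A per capita rate of the kind mentioned here is simple to derive once raw counts and population figures are side by side. The sketch below uses placeholder district names and invented numbers purely to show the arithmetic; the function names are illustrative and do not describe any existing tool.

```python
from typing import Dict, List


def per_capita_rates(staffing: Dict[str, int],
                     population: Dict[str, int],
                     per: int = 100_000) -> Dict[str, float]:
    """Assistant U.S. attorneys per `per` residents for each district."""
    return {d: staffing[d] * per / population[d] for d in staffing}


def rank_districts(rates: Dict[str, float]) -> List[str]:
    """District names ordered from the highest rate to the lowest."""
    return sorted(rates, key=lambda d: rates[d], reverse=True)


if __name__ == "__main__":
    # Placeholder figures, invented for illustration only.
    staffing = {"District A": 50, "District B": 30, "District C": 80}
    population = {"District A": 2_000_000,
                  "District B": 600_000,
                  "District C": 8_000_000}
    rates = per_capita_rates(staffing, population)
    for district in rank_districts(rates):
        print(f"{district}: {rates[district]:.1f} per 100,000 residents")
```

In this invented example the district with the largest raw count ranks last once population is taken into account, which is the kind of comparison raw counts alone cannot support.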

Information about how the law works in practice can be a valuable component

of the practicing lawyer’s advice. Governmental and other web sites are constantly

expanding the kinds of operational and other information available and reducing

the burdens of accessing that information. However, the sites usually stop short of

providing a full range of the data-mining and other user tools that would enable

the practicing lawyer to correctly assess the environment in which advice is to be

given.

TRAC is a work in progress. It is continuously expanding both the kinds of data

mining tools provided and the breadth and depth of the data. As other sites

copy and build upon the TRAC model, the range of available information is likely

to expand exponentially thus increasing the impact on the practice of law. If data

about the summons, service, and disqualification of grand and petty jurors becomes

available on line, for example, could cases for discrimination at various points in

the process be made more easily and cheaply than when the claimant had to pay

for the collection and analysis of the data? Finally, could the new information even

create new causes of action?

8. Conclusion

Information technology has had and continues to have a huge impact on our work

and the way in which we do it. The practice of law is no exception. Previously,

veteran attorneys have had to rely on their experience with the workings of the legal

system to design strategies for handling individual cases. While experience will

always play an important role in this regard, advanced technology, which makes it

possible to access and use immense stores of electronic data, can now offer another

alternative.

In this paper we have presented a model for making previously inaccessible

information about the U.S. federal government available to lawyers. The model,

as implemented by Transactional Records Access Clearinghouse (TRAC), uses a

data warehouse and specially designed data mining tools to provide access to a

wide variety and vast amount of federal data. The media, public interest groups,

concerned citizens, and even the government itself are already using TRAC’s web

site. Users need not be trained data analysts to use the data mining tools, although

experienced analysts often choose to use them because they are both powerful and

easy to use.

Despite the uncomplicated point and click interface, lawyers and other subscribers

have sometimes experienced problems making full use of TRAC’s system.

Problems range from knowing where to click to envisioning how the information

could be useful. Currently, TRAC staff members work with subscribers individually

to solve their problems. However, this paper has proposed an expert system

that could automate the help desk. The expert system, which is visualized as an

Automated Case Advisor, would present information directly relevant to lawyers

as well as help them learn the capabilities of the system.

As additional producers of data adopt TRAC’s model, the amount of information

available is likely to increase dramatically. As a result, attorneys undoubtedly

will witness changes to the way their profession is practiced, how the law is administered,

and the types of research they are expected to do. Cases that previously

would have required costly collection and analysis of data will be brought more

easily. Advice that was previously based on hunches alone will be grounded in

data. The overall records of judges and prosecutors, now available for all to see,

will provide an image of the workings of the U.S. federal system of jurisprudence

that would not exist without information technology. In short, we can expect the

impact to be dramatic.

Note

1 Portions of this paper are based on “Data Warehouses and Data Mining Tools for the Legal Profession:

Using Information Technology to Raise the Standard of Practice” by Roberge, L., Long, S.,

and Burnham, D. forthcoming in the Syracuse Law Review, Volume 52, Book 4.

Appendix A: Examples of user interface for TRAC’s data mining tools

Figure 1A. Criminal express tool (white collar crime – declinations).

Figure 2A. Criminal express tool output (white collar crime – declinations).

Figure 3A. Civil going deeper tool.

Figure 4A. Fourth level table output from civil going deeper tool.

Figure 5A. Analyzer tool – creating data slice.

Appendix B: Automated Case Advisor dialog (Module 1)

The sample dialog below presents one of many possible interactions that a lawyer

could have with the Automated Case Advisor. (For alternate paths, see the tree

structure in Figure 2B below.) At the outset of the consultation, the Automated Case

Advisor would ask for the focus that the lawyer would like. In all cases, a list of

choices would be presented to the user. Where appropriate, the user may also enter

the choice of unknown.

Automated Case Advisor questions            Lawyer answers

Focus of Consult?                           Settlement Negotiation
What type of case?                          Criminal
What stage?                                 Referral
Which agency investigated?                  FBI
What is the nature of the case?             Health Care Fraud
What is the lead charge?                    Unknown
What is the referral district?              Penn, W.
Has the prosecution been filed?             No
Who is the prosecutor handling the case?    Jane Doe
Who is the judge?                           Unknown

Figure 1B. Sample dialog with automated case advisor.

From here, the Automated Case Advisor would prepare a report in the form of

an html page based on answers that had been input. This allows the user to access as

much or as little information as needed. From here it will also be possible to change

the input thus allowing the lawyer to do “what-if” analyses with different scenarios.

Based on the example interaction above, a report might include the information

shown in Figure 3B below.
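One way the consultation logic might work is sketched below, under several assumptions: answers of "unknown" are simply dropped from the selection criteria, the remaining answers are matched exactly against case records, and a "what-if" analysis is a re-run with one or more answers changed. The field names and the four case records are hypothetical and invented for illustration; they are not drawn from TRAC's warehouse.

```python
from typing import Dict, List

UNKNOWN = "unknown"

# Invented records for illustration only.
SAMPLE_CASES: List[Dict[str, str]] = [
    {"agency": "FBI", "program": "Health Care Fraud", "district": "Penn, W."},
    {"agency": "FBI", "program": "Health Care Fraud", "district": "Penn, E."},
    {"agency": "FBI", "program": "Bank Fraud", "district": "Penn, W."},
    {"agency": "DEA", "program": "Narcotics", "district": "Penn, W."},
]

# Answers modeled on the sample dialog; the keys are hypothetical field names.
ANSWERS: Dict[str, str] = {
    "agency": "FBI",
    "program": "Health Care Fraud",
    "lead_charge": "unknown",
    "district": "Penn, W.",
    "judge": "Unknown",
}


def build_filter(answers: Dict[str, str]) -> Dict[str, str]:
    """Keep only the answers the lawyer actually knows."""
    return {field: value for field, value in answers.items()
            if value.strip().lower() != UNKNOWN}


def matching_cases(cases: List[Dict[str, str]],
                   answers: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the cases that agree with every known answer."""
    criteria = build_filter(answers)
    return [case for case in cases
            if all(case.get(field) == value for field, value in criteria.items())]


def what_if(answers: Dict[str, str], **changes: str) -> Dict[str, str]:
    """Copy the answers with some of them changed, leaving the original intact."""
    revised = dict(answers)
    revised.update(changes)
    return revised


if __name__ == "__main__":
    print(len(matching_cases(SAMPLE_CASES, ANSWERS)))
    print(len(matching_cases(SAMPLE_CASES, what_if(ANSWERS, program="unknown"))))
```

Because unknown answers widen rather than block the selection, the lawyer need only enter what is known, and each what-if scenario leaves the original answers untouched for comparison.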

Figure 2B. Case advisor top level tree structure (Module 1).

Figure 3B. Case advisor interactive output.

References

Ballou, D. P. and Tayi, G. K. (1999). Enhancing Data Quality In Data Warehouse Environments.

Communications of the ACM 42(1): 73–78.

Benander, A., Benander, B., and Fadlalla, A. (2000). Data Warehouse Administration and Management.

Information Systems Management 17(1): 71–80.

Berry, M. and Linoff, G. (1997). Data Mining Techniques. John Wiley and Sons, Inc.: New York.

Cox, G. (2002). Voir Dire: Those Appealing Defendants. The National Law Journal 24(19): citing

Schwab, Eisenberg, and Claremont. (forthcoming) University of Illinois Law Review,

Plaintiphobia.

Durkin, J. (1994). Expert Systems Design and Development. Prentice Hall: Englewood Cliffs, NJ.

Dyche, J. (2000). e-Data: Turning Data into Information with Data Warehousing. Addison-Wesley:

Reading, MA.

Friedland, L. (1998). Accessing the Data Warehouse: Designing Tools to Facilitate Business

Understanding. Interactions, 25–36.

Hall, O. P. Jr. (2000). Mining the Store. Journal of Business Strategy 22(2): 24–27.

Ignizio, J. (1991). An Introduction to Expert Systems. McGraw-Hill: New York.

Jarke, M., Lenzerini, M., Vassiliou, Y., and Vassiliadis, P. (2000). Fundamentals of Data Warehouses.

Springer-Verlag: New York.

Kimball, R. and Ross, M. (2002). The Data Warehouse Toolkit, 2nd Edition. John Wiley and Sons:

New York.

MacDonald, M. and Oettinger, A. (2002). Information Overload. Harvard International Review, 44–

48.

Mallach, E. (2000). Decision Support and Data Warehouse Systems. McGraw-Hill: Boston.

Roberge, L. (1994). The Impact of a Natural Language Interface on Barriers to Information Access.

Ph.D. diss., School of Management, Syracuse University, Syracuse, New York.

Roberge, L., Long, S., and Burnham, D. (2002). Data Warehouses and Data Mining Tools for the

Legal Profession: Using Technology to Raise the Standard of Practice. Forthcoming in Syracuse

University Law Review 52: Book 4.

Toffler, A. (1970). Future Shock. Random House, Inc.: New York.

Turban, E. and Aronson, J. (2001a). Decision Support Systems and Intelligent Agents, 6th ed.

Prentice Hall: Upper Saddle River, NJ.

Turban, E., McLean, E., and Wetherbe, J. (2001b). Information Technology for Management. John

Wiley and Sons: New York.

Witten, I. and Frank, E. (2000). Data Mining. Academic Press: San Diego, CA.

 

 

Retrieved from: http://www.kluweronline.com