Technology
and the changing practice of law: An
entrée
to previously inaccessible information via
LINDA
ROBERGE1, SUSAN LONG1, PATRICIA HASSETT2 and DAVID
BURNHAM3
1TRAC School of Management,
Syracuse University, USA
E-mail:
lroberge@syr.edu; suelong@syr.edu
2College of Law, Syracuse
University, USA
E-mail:
phassett@law.syr.edu
3S. I. Newhouse School of
Communications, Syracuse University, USA
E-mail:
burnham@epic.org
Abstract. The proliferation of electronic
databases is raising some important questions about how
the evolving access to new or
previously inaccessible information is likely to change the practice
of law. This paper discusses
TRAC, an interesting electronic source of previously inaccessible information
that is currently used by
members of the media, public interest groups, lawyers, and the
federal government. Summaries,
reports, and snapshots of TRAC’s data can be accessed through a
series of public web sites.
TRAC’s subscription service allows users access to the data warehouse
and data mining tools (see
http://tracfed.syr.edu/info.html for more information). Additionally the
paper examines how AI can be
employed to assist the legal profession in using TRAC’s
data. Finally, it speculates
about how TRAC and other new electronic data sources may impact the
practice of law.
Key words: automated case advisor, data
mining, data warehouse, Freedom of Information Act,
statistical information, U.S.
Government
1. Introduction
Suppose for a moment that you
are about to appear for the first time in a U.S.
District court. You know the
law but you would also like a better picture of how
the system really works.
What’s the likelihood that your case will be dismissed?
If not dismissed, how likely
is your client to be acquitted? If not acquitted, what
are the typical prison
sentences? Who are the Assistant U.S. Attorneys? How good
are they? Are the salaries
high enough to keep the good ones there? You may have
all these questions and more,
but where would you find such information? Perhaps
your colleagues could relate
their experiences, but is there any hard data?
In this paper, we discuss how
new information technologies, specifically data
warehouses coupled with data
mining tools, can be used to make existing sources
of data available, accessible,
and useful to lawyers. In particular we demonstrate
how Transactional Records
Access Clearinghouse has created a data warehouse
and designed easy to use data
mining tools that now provide a picture of the U.S.
federal system of
jurisprudence that has never before been available.
As advanced information
technology makes more and more electronic information
accessible to non-technical
people, practicing attorneys need to keep an eye
on these developments and to
ask how the availability of new information will impact
the practice of law. Will the
information impose new duties on the lawyer? Will the
lawyer who fails to canvass the
Internet for information relevant to a client matter be
guilty of malpractice in the
same way that a lawyer’s failure to know the law might
be malpractice? On the other
hand, will the information enable the lawyer to pursue
cases that would have been
prohibitively expensive because of data inaccessibility?
2. TRAC: The power of
information1
Transactional Records Access
Clearinghouse, more commonly known by its acronym
TRAC, is a research center at
Syracuse University that has developed
an application that can serve
as a model for making information available. A
team of professionals with
backgrounds in statistics, journalism, government records,
information systems, web
development, and statistical programming serves
as TRAC’s primary staff
members. In addition, public interest lawyers volunteer
their services to obtain raw
data from the federal government via the Freedom of
Information Act. The purpose
of TRAC is to provide the American people, as well
as institutions of oversight
like Congressional committees, news organizations,
public interest groups and
scholars, with the information they need to fairly judge
the performance of the federal
government. (See http://tracfed.syr.edu/info.html for
more information.)
Researchers at TRAC achieve
the center’s purposes through the creation of
a data warehouse and specially
designed data mining tools. Users are able to
access the information via the
internet using an ordinary web browser such as
Netscape or Internet Explorer.
The data warehouse contains data from many different
sources including the
Executive Office for United States Attorneys in the
Justice Department, the
Administrative Office of United States Courts, the Office
of Personnel Management,
the Internal Revenue Service, the Environmental
Protection Agency, the Census
Bureau, and a range of other specialized federal
agencies. Areas covered
include criminal enforcement, civil actions, administrative
enforcement by the IRS,
federal staffing, records of judges and prosecutors, federal
expenditures and more.
As indicated by the center’s
name, TRAC staffers always try to obtain transactional
information about individual
matters rather than summary data. Concerning
criminal enforcement, for
example, TRAC acquires data from the Executive Office
for United States Attorneys
about each referral for prosecution. Data elements
about each referral include
where and when the referral was filed, which agency
made the referral, the lead
charge, the initials of the assistant U.S. Attorney handling
the matter, whether it was
declined for prosecution and why, whether the
matter resulted in an
indictment, the initials of the presiding judge, the outcome,
and if convicted, the
sentence. TRAC continually submits FOIA requests to obtain
additional data from new
sources.
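The data elements listed above can be pictured as one record per referral. The following is a minimal Python sketch, not TRAC's actual schema: every field name and every sample value is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Referral:
    """One transactional record; all field names are hypothetical."""
    district: str                  # where the referral was filed
    filed_on: date                 # when the referral was filed
    agency: str                    # agency that made the referral
    lead_charge: str               # lead charge
    prosecutor_initials: str       # assistant U.S. Attorney handling the matter
    declined: bool                 # whether prosecution was declined
    decline_reason: Optional[str] = None
    indicted: bool = False
    judge_initials: Optional[str] = None
    outcome: Optional[str] = None
    sentence_months: Optional[int] = None  # only meaningful after a conviction


# Invented sample record, for illustration only.
example = Referral(
    district="N.D.N.Y.",
    filed_on=date(2001, 3, 15),
    agency="FBI",
    lead_charge="18 USC 1343",
    prosecutor_initials="A.B.",
    declined=True,
    decline_reason="weak evidence",
)
```

Keeping one record per matter, rather than pre-computed totals, is what allows the same data to be counted later by district, agency, charge, prosecutor, or judge.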
By accessing the data
warehouse, users can see the details about each individual
matter. The real power of the
warehouse, however, is in the data mining tools
specifically developed by TRAC
programmers using SAS, a powerful analysis software
suite. These tools allow users
to generate tables, graphs and maps to answer
a wide variety of questions
about how the law is enforced in different parts of the
country and how the
enforcement of different laws has changed over time. The list
of potential questions is
endless. The ability to find relevant information quickly
and easily gives a picture of
the workings of the US federal government that would
be impossible to get by
looking at the details of each action individually.
3. Understanding the technology
In TRAC’s application, which
can serve as a model for others, the transformation
from data to information and
subsequent publication requires three different
information technologies,
namely data warehouses, data mining tools, and the internet.
While the vast majority of
readers are likely to be familiar with the internet,
data warehouses and the
statistical tools that allow users to analyze or “mine” the
data are less well known.
3.1. THE DATA WAREHOUSE
Data warehouses are not the
same thing as databases. A database is a collection of
information that records and
tracks the individual activities or transactions of an organization.
For example, when a government
employee is hired, information about
the employee and his/her job
is recorded in a transactional database. As information
about the employee changes
(e.g., salary, work schedule, or grade) the database is
updated. The software that
manages databases continues to decrease in cost and
increase in user friendliness.
As a result, the number of transactional databases
has risen dramatically. When
the numbers of databases proliferate, we end up with
both “information overload” as
depicted in Alvin Toffler’s Future Shock, and with
isolated collections of data
often referred to as “information silos” (Toffler 1970).
The greater the amount of
data, the greater the number of silos, and the harder it
becomes to use data related to
individual parts of an organization to understand the
organization in its entirety.
Information overload and the resulting compartmentalization
of data have been cited as one
of the reasons that multiple clues about the
September 11 terrorism attack
were never connected (see for example MacDonald
2002).
One purpose of data warehouses
is to enable connections among the individual
silos of information that
exist in most large organizations including the U.S.
federal government (Jarke
2000). Data warehouses facilitate these connections because
they differ from transactional
databases in several significant ways. First,
warehouses consist of one or
more transactional databases that have been reengineered
so that they can be
integrated. Second, in addition to transactional records,
warehouses may contain
summarized data that can also be integrated with the
transactional data. And
finally, warehouses contain historical data that is updated
periodically, often quarterly
or yearly, rather than “live” data that is constantly
being updated in real-time. “A
distinguishing characteristic of warehoused data is
that it is used for decision
making rather than for operations” (Ballou 1999).
However, the integration of
historical data via data warehouses brings its own
problems. Codes that designate
categories may change over the years, the boundaries
of geographic districts may
change, new codes may be added, others dropped,
and so on. There are a number
of warehousing packages that purport to diminish
these problems. Nevertheless,
when compared to database software, warehouse
software is neither
inexpensive nor easy to use. Despite the existence of new
warehousing tools, “. . .
[the] creation, maintenance, and daily administration of
data warehouses are still formidable
tasks that are far from being fully automated”
(Benander 2000).
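One common way to cope with codes whose meaning changes over the years is a crosswalk table that maps each source code, for a given range of years, to a stable category. The sketch below is only an illustration of that idea under assumed data; the codes, year ranges, and category names are invented and do not describe any agency's coding scheme.

```python
# Hypothetical crosswalk rows: (first year, last year, source code, category).
CROSSWALK = [
    (1986, 1995, "07", "drug_trafficking"),
    (1996, 2002, "07", "immigration"),       # same code reused with a new meaning
    (1996, 2002, "12", "drug_trafficking"),  # new code added for the old category
]


def harmonize(year, code):
    """Return the harmonized category, or None if the code is unmapped."""
    for first, last, src, category in CROSSWALK:
        if first <= year <= last and src == code:
            return category
    return None
```

Returning None for an unmapped code, rather than guessing, leaves the decision about unknown codes to a human reviewer.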
To build the data warehouse,
the team of researchers at TRAC begins by searching
through government manuals,
websites and other such sources to identify
relevant systems of records and
transactional databases maintained by different
agencies. Based on these
leads, TRAC makes requests for specific data sets and
all of the agency
documentation describing the details of what is covered and how
the information is organized.
The requests usually are made under the Freedom
of Information Act (FOIA).
When release of the data entails a lawsuit and court
decision, this beginning step
can require a great deal of time and expense.
Once the data sets and
documentation are in hand, statistical and other kinds
of checks are made to test the
completeness and reliability of the information
that has been provided. For
example, have code tables been provided? Are all
codes appearing in the
database listed in the code tables? Are there codes in the
tables that don’t appear in
the data? Are some fields not used? Are there internal
inconsistencies – e.g., does a
case end before it starts? Data from different sources
about similar events may be
compared or merged as a further check on data set
reliability. While a few parts
of the reliability assessment are generic across data
sources and can be automated,
most of the process requires human intervention.
(For a more complete
discussion of data quality in a data warehouse environment,
see for example Mallach 2000.)
If the overall quality of a data set is poor, it will
not be merged into the data
warehouse. On the other hand, if only a limited number
of fields are found to be
unreliable, the data may be merged but the questionable
fields will not be used for
analyses. When this happens, users are warned about the
shortcomings.
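The generic, automatable part of these checks can be sketched in a few lines of Python. This is an illustrative sketch only, not TRAC's procedure: the record layout, the field names (id, outcome_code, start, end, notes), and the sample data are all assumptions.

```python
from datetime import date


def check_quality(records, code_table, fields):
    """Run the generic checks described above and return the findings."""
    used_codes = {r["outcome_code"] for r in records if r.get("outcome_code")}
    return {
        # Codes that appear in the data but are missing from the code table.
        "undocumented_codes": sorted(used_codes - set(code_table)),
        # Codes listed in the table that never appear in the data.
        "unused_codes": sorted(set(code_table) - used_codes),
        # Fields that are empty in every record.
        "unused_fields": [f for f in fields
                          if all(r.get(f) in (None, "") for r in records)],
        # Internal inconsistency: a case that ends before it starts.
        "ends_before_start": [r["id"] for r in records
                              if r.get("start") and r.get("end")
                              and r["end"] < r["start"]],
    }


# Invented sample data, for illustration only.
records = [
    {"id": 1, "outcome_code": "G", "start": date(2000, 1, 5),
     "end": date(2000, 6, 1), "notes": None},
    {"id": 2, "outcome_code": "X", "start": date(2001, 2, 1),
     "end": date(2000, 12, 1), "notes": ""},
]
code_table = {"G": "guilty", "D": "dismissed"}
report = check_quality(records, code_table,
                       ["outcome_code", "start", "end", "notes"])
```

A report of this kind only flags candidates; deciding whether a flagged field is unreliable enough to exclude still requires the human judgment described above.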
Most transactional data
require augmentation before being loaded into data
warehouses so that data mining
tools can be used effectively (Berry 1997). Therefore,
the linking, grouping, and
classification variables that will be used to place the
data into
geo-political-temporal context must be developed and added to the data.
Additionally, numerous
performance criteria are defined, indicators developed, and
these too become part of the
warehouse. Together these related sources of data,
called metadata, form a vital
part of the data warehouse because they contain the
domain knowledge that makes
the data useful (Dyche 2000; Kimbal 2002; Whitten
2000). Because the contextual
needs of various users may be quite different
depending on their interests,
TRAC attempts to provide as much related metadata
as possible including
geography, population, time trends, constant/real dollars, etc.
Recently, TRAC has begun the
ambitious task of incorporating new data into its
data warehouse on a monthly
basis.
The size of TRAC’s data
warehouse is enormous, taking up approximately 300
gigabytes of storage space.
For example, because there were more than 140,000
criminal referrals for
prosecution in a recent year and the online data go back
to 1986, the information in
this area alone is extensive. (The Justice Department
recently estimated that they
have supplied more than 25 million records to TRAC!)
In addition, TRAC’s
enforcement data cover civil matters – where the government
is either the plaintiff or the
defendant – and administrative actions by the IRS.
Along with the enforcement data
there is information on staffing going back to
1975, federal spending going
back to 1993, federal judges going back to 1986, and
federal prosecutors going back
to 2001. While TRAC constantly strives to automate
the production processes for
adding new data from existing sources, preparing data
from new sources continues to
require the skills of highly trained statisticians, data
analysts, and computer
programmers in addition to individuals with expertise in
the subject matter.
3.2. THE DATA MINING TOOLS
As daunting as establishing a
data warehouse may be, creating information from
the warehouse is yet another
formidable task. Given the size of most data warehouses,
finding information amid all
the data can be a major problem. This is where
the data mining tools come
into play. “The purpose of data mining is to explore the
data warehouse looking for
trends, relationships, and outcomes” (Hall 2001). The
objective is to find the
patterns that will provide a coherent unified view of the organization,
to discover what is happening
within the organization, and to place this
information into a context
that will make it understandable and usable for action.
This can only be accomplished
by merging information from several databases.
For example, knowing how much
money was spent in a particular district will be
much more meaningful if you
can integrate population data to enable comparisons
between districts, or constant
dollar calculations to enable comparisons over time.
It is the integration of
separate databases, including the metadata, that provides the
geo-political-temporal
information necessary for interpretation. For integration, we
need tools that allow users to
specify what information they want without needing
to understand the complexities
behind how the data is organized and combined, and
how the information was
generated (Friedland 1998). While some data mining
tools explore data
automatically, other types of tools operate in a semi-automatic
fashion in that they allow
users to direct how and where to search for the trends
and relationships (Berry 1997;
Witten 2000).
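The two adjustments mentioned above, population-based comparison between districts and constant-dollar comparison over time, reduce to simple arithmetic once the contextual data has been merged in. The following Python sketch uses invented figures and hypothetical function names; it is not drawn from TRAC's code.

```python
def per_capita(amount, population):
    """Spending per resident, so districts of different size can be compared."""
    return amount / population


def constant_dollars(amount, price_index, base_index):
    """Restate a nominal amount in base-year dollars using a price index."""
    return amount * base_index / price_index


# Invented figures for two hypothetical districts.
spending = {"District A": 12_000_000, "District B": 9_000_000}
population = {"District A": 6_000_000, "District B": 1_500_000}
per_person = {d: per_capita(spending[d], population[d]) for d in spending}
```

In this invented example District A spends more in total, but District B spends three times as much per resident, which is the kind of reversal that raw totals hide.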
TRAC has developed three
different types of semi-automatic data mining tools
that enable users to direct
the analysis of the warehoused data. Although experienced
statisticians historically
have been the ones to do data analysis, most users
don’t have access to such
expertise (Dyche 2000). Thus TRAC’s tools were created
to allow both novice and
advanced users to perform whatever data analysis they
need, regardless of their
statistical expertise.
The first tool is called
Express. As the name implies, this tool allows users to
quickly and easily produce
counts, averages, medians, and other specially computed
measures that are used to
generate rankings, comparisons, and trends. Users
can specify if they want the
information by district, agency, program area, or lead
charge/cause of action. For
IRS audits, users can also choose to have the information
produced by income class,
selection reason, and auditor type. Additionally, users
are able to indicate whether
they want the information returned to them in the form
of tables, graphs, or maps.
Because this tool is both powerful and easy to use, it is
frequently the tool of choice
for novices and power users alike.
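The kind of summary that Express returns (counts, averages, medians, and a ranking for a chosen grouping) can be approximated as follows. This is a hedged sketch rather than a description of the SAS-based tool: the function name, the field names, and the sample sentences are assumptions.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(records, group_by, measure):
    """Count, average, and median of `measure` per group, ranked by median."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_by]].append(r[measure])
    rows = [
        {"group": g, "count": len(v), "average": mean(v), "median": median(v)}
        for g, v in groups.items()
    ]
    # Rank groups from the highest median to the lowest.
    rows.sort(key=lambda row: row["median"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows


# Invented sample data, for illustration only.
convictions = [
    {"district": "A", "sentence_months": 12},
    {"district": "A", "sentence_months": 24},
    {"district": "A", "sentence_months": 60},
    {"district": "B", "sentence_months": 6},
    {"district": "B", "sentence_months": 10},
]
table = summarize(convictions, "district", "sentence_months")
```

Reporting the median alongside the average matters for sentence data, because a single very long sentence can pull the average well above what is typical.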
Sometimes users need
multidimensional views of the data that facilitate comparisons
across groups, years,
organizational entities, etc. Data miners refer to
this process as “slicing and
dicing” (Dyche 2000). To understand how one district
handles health care fraud, for
example, users may want to know how many of the
referrals get prosecuted
versus how many are declined. How does this compare
with other districts? Has this
changed over the years? To provide this capability,
TRAC has developed a second
tool called “Going Deeper”. This tool allows users
to focus on a particular stage
in the referral process and to generate performance
measures such as percentages,
rates relative to the population, and outcomes. Going
Deeper, as the name suggests,
provides a drill-down capability that enables users
to produce and view the data
as a series of linked tables that focus on increasingly
narrower subsets of data down
to a listing of the individual matters or federal employees.
As with the Express tool,
Going Deeper is easy to use via a point and click
interface. Users with a wide
variety of experience find that it too is user-friendly
and capable of generating
complex information.
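The drill-down idea, in which each step narrows the previous subset until individual matters are listed, can be illustrated with a short Python sketch. The data, field names, and helper functions below are hypothetical and are not taken from the Going Deeper tool itself.

```python
def drill_down(referrals, **filters):
    """Keep only the referrals that match every filter given."""
    return [r for r in referrals
            if all(r[k] == v for k, v in filters.items())]


def decline_rate(referrals):
    """Percentage of referrals declined for prosecution, or None if empty."""
    if not referrals:
        return None
    declined = sum(1 for r in referrals if r["declined"])
    return 100.0 * declined / len(referrals)


# Invented sample data, for illustration only.
referrals = [
    {"id": 1, "district": "A", "program": "health care fraud", "declined": True},
    {"id": 2, "district": "A", "program": "health care fraud", "declined": False},
    {"id": 3, "district": "A", "program": "narcotics", "declined": False},
    {"id": 4, "district": "B", "program": "health care fraud", "declined": True},
]
level1 = drill_down(referrals, program="health care fraud")  # all districts
level2 = drill_down(level1, district="A")                    # one district
listing = [r["id"] for r in level2 if r["declined"]]         # individual matters
```

Each level is computed from the one before it, so the performance measure at any level can always be traced back to the individual matters behind it.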
The most advanced tool is the
“Analyzer”. This tool allows users to specify a
particular slice of data that
is of interest to them, and to store their own unique
subset of data in a personal
“web locker”. From the web locker, the user can run
numerous types of
sophisticated analyses, the results of which can also be stored
in the web locker. As an
adjunct to Express and Going Deeper, Analyzer provides
users with the ability to
perform sophisticated analyses on any subset of data in
which they are interested.
Through the use of these
tools, users are able to literally “create” information
by entering the data warehouse
and analyzing the data contained in the individual
transactional records
concerning each matter. Although the tools are easy enough
for beginners to use, they are
powerful enough that many statisticians and data
analysts choose to use them
over traditional analysis software. Examples of the user
interfaces for these tools are
located in Appendix A.
3.3. THE WEB SITES
To provide access to the data
warehouse and the data mining tools, TRAC harnesses
the power of the Internet by
maintaining two broad categories of web sites. First is
a series of six free public
web sites that mostly focus on the criminal enforcement
activities of the Federal
Bureau of Investigation, the Immigration and Naturalization
Service, the Drug Enforcement
Administration, the Bureau of Alcohol, Tobacco and
Firearms, the Customs Bureau,
and the Internal Revenue Service. (Sites can be
entered from
http://trac.syr.edu/). TRAC’s IRS site also includes information about
IRS administrative
actions–audits, seizures, levies and liens, etc. Using data from
the data warehouse, the free
web sites offer pre-selected, but very extensive, views
of each agency’s enforcement
activities, both nationally and within individual
districts, along with graphs,
maps, and tables that highlight interesting findings
and trends over time. The free
sites also offer special studies on such subjects as
counter-terrorism enforcement
and long-term changes in federal staffing. The free
sites do not allow users
access to the data mining tools that would allow them to
tailor the information to meet
their own unique needs.
TRAC’s second offering
consists of a dynamic subscription site that provides
vastly more information as
well as access to the data mining tools (available at
http://tracfed.syr.edu/). In
the criminal area, for example, enforcement data can be
organized by statute,
district, Justice Department program category, and by virtually
any agency. In the civil area,
extensive information about the civil matters
processed by the U.S.
Attorneys where the government is the plaintiff or the defendant
can be organized by cause of
action, government role, or court type in
addition to district and
agency. Yet another area offers agency-by-agency staffing
information - from statistical
overviews by federal judicial district, state, county or
city down to the names and
salaries of individual employees. Recent additions
to this area include express
tools that allow users to compare and contrast the
records of individual federal
prosecutors and district court judges, and to obtain
listings of the individual
cases. Federal expenditure data, agency-by-agency and
program-by-program, provide
yet another perspective on the government.
TRAC’s uniquely comprehensive
collection of information has become an important
stop for those interested in
the actual operations of the federal government.
TRAC’s six public sites, for
example, are receiving “hits” at a rate of more than 5
million a year. Clients of
TRAC’s subscription service now number more than 300
individuals plus an
ever-growing number of libraries, congressional committees,
and law firms with site
licenses. Additionally, for data that has not yet been sufficiently
processed to be
included in the data warehouse, clients can commission
special pre-release
statistical runs.
4. Lawyers, data, and the TRAC
model
Without advanced information
technology, seasoned lawyers have relied on their
experience with the workings
of the legal system to make decisions about how
to handle a case and what
advice to give to their clients. Many law firms allocate
substantial research dollars
and staff time to costly efforts aimed at collecting anecdotal
information about the
functioning of the system in particular districts and the
professional proclivities of
individual judges and prosecutors. Unfortunately, experience
can be misleading. Are defendants
really more successful than plaintiffs in
getting adverse trial outcomes
reversed on appeal? Does one judge impose longer
sentences than another for
similarly situated defendants? How frequently does a
particular prosecutor decline
certain types of cases? Before giving advice based
on experience, it would be
nice to be able to confirm impressions, hunches, and
anecdotes with data. In fact,
as other sources of information become available, the
practicing lawyer needs to be
aware that such information is likely to raise the best
practice standards of the
profession.
The five scenarios below
present situations that lawyers might encounter where
existing data available
through TRAC could be helpful.
Scenario 1: It will cost X amount to bring
a civil suit against the government.
For this type of case, how
often does the court find in favor of the plaintiff? In
what proportion of cases are
monetary damages awarded? What is the average
amount of damages awarded?
Given the Assistant U.S. Attorney handling the
case, is the settlement likely
to be more or less favorable than average?
Scenario 2: You have a pending
environmental case and are negotiating with
a specific Assistant US
Attorney. You would like to obtain a listing of other
environmental matters handled
by this particular prosecutor so that you could
see which cases were declined
or dismissed, which went forward, and what
the outcomes were.
Scenario 3: Your client has been notified
that the IRS has decided to audit his
return. For this type of
audit, what percentage of cases is assessed additional
taxes and penalties? On
average, how much additional tax is owed? If the
IRS offers a compromise, what
would constitute a reasonable offer?
Scenario 4: After an investigation, the
FBI has referred your client’s case
to the U.S. Attorney’s office
for prosecution. What proportion of the FBI’s
referrals gets declined
because there hasn’t been adequate preparation of the
case? If your client is
prosecuted, what is the record of this prosecutor in
obtaining jail time for this
type of charge? If the prosecutor offers a plea
bargain, should you advise
your client to accept?
Scenario 5: You are concerned about defending
a racketeering case because
you are afraid it will drag on
forever. For your district, how long has it taken to
prosecute other cases with
this lead charge? For this prosecutor? For the judge
who will be hearing the case?
5. Enhancing TRAC for lawyers
While TRAC is a fully
functional, powerful application used by many outside the
legal profession, it has not
yet come close to realizing its full potential for lawyers.
Two major categories of
problems currently stand in the way. The first category
relates to problems that are
experienced by the developer in adding relevant new
data to the data warehouse,
most notably cost. Developing a data warehouse and user-friendly
data mining tools, and then
delivering the product to consumers, is intellectually
challenging, technically
demanding, time-consuming, and amazingly expensive.
When the extra burden
associated with FOIA lawsuits to obtain needed new sources
of data is added in, the cost
structure of the application becomes prohibitive. In
short, providing information is
a costly business. While vitally important, these
issues are beyond the scope of
this paper.
The second category of
problems that stand in the way of lawyers realizing
the full potential of TRAC
includes a number of related issues that many lawyers
encounter in attempting to use
what is, fundamentally, a statistical application.
After all, the qualities that
make a good lawyer aren’t necessarily the same ones
that foster statistical
thinking. Luckily, this category is more easily ameliorated
than the first.
An important point to consider
here is how lawyers differ from other users.
In some respects, lawyers are
more informed users. They understand how matters
move through the legal system
from the referral stage through final disposition.
They also are more familiar
with the laws than other users. In other respects, however,
lawyers differ very little
from their counterparts in other professions. They
tend to ask the same types of
questions and encounter the same types of problems.
Analyzing the questions
received by TRAC’s help desk creates a good picture of
ways in which TRAC can be
enhanced to assist lawyers with their data utilization
needs.
5.1. BARRIERS AND USER EXPERIENCE
The types of barriers faced by
users can be grouped into three general categories:
technical, informational, and
conceptual. The technical barriers relate to the problems
that users may encounter with
the use of TRAC’s data mining tools, printing,
or problems that are unrelated
to the data. In general these barriers tend to be related
to a user’s “web-savvy-ness”.
Users who surf the web with ease experience few
difficulties. For non-surfers,
knowing where to click can be a real problem.
The informational barriers are
those that relate in some way to the data, the
performance measures, or to
the statistics. Examples include the meaning of codes,
what is covered, the
difference between the mean and the median, etc. While some
of the informational and
technical barriers are of similar complexity, many of the
informational barriers require
more advanced assistance from the help desk.
The last category is an
amalgam of sometimes very abstract barriers. Included
are issues ranging from “Why
would I be interested?” to the correct use of probabilities
and data as evidence. The help
needed here requires a great deal of staff
expertise.
Within each barrier category,
specific questions are largely dependent on how
much experience the user has
had with the web in general and, more specifically,
with TRAC’s system. Novice
users don’t experience the same problems that more
experienced users encounter.
However, users of all experience levels encounter, to
some degree, all types of
barriers. For example, new users who do some initial
exploring generally find that
the data mining tools are very easy to use. They
are readily able to point and
click and get “numbers” in return for their efforts.
They find it harder to figure
out what those numbers mean, or why they might be
interested. Many first time
users lack confidence in deciding which information is
most relevant to them.
On the other hand, users with
more experience may know exactly what they are
looking for and why they need
the information. Many of these intermediate users
have precise questions they
want to answer but are unsure how best to generate
the exact information
required. Lastly, power users tend to understand exactly the
questions to ask and how to
generate the information to answer those questions.
While they may encounter
occasional problems with use of the advanced tools,
their major barriers often
relate to the conclusions that can be drawn.
As an aid to designing
appropriate help facilities, the barriers, as faced by differently
experienced users, have been
organized into a simple matrix. Table I below
presents the matrix along with
sample questions in each category.
6. Using AI to overcome
barriers
Currently TRAC works with
users individually to answer questions and solve problems.
As long as the numbers of
users remain small, existing staff members are
able to manage the volume of
help desk calls. However, as users grow not only in
numbers, but more importantly,
in expertise, the types of questions presented to the
help desk may require more
knowledgeable staff. As a result, TRAC is exploring
ways to utilize Artificial
Intelligence to provide some of the assistance to users that
is now being supplied by
senior staff members.
Table I. Taxonomy of barriers to the use of TRAC
Many of the barriers being
handled by the help desk are not unique to the users
of TRAC’s system. They are, in
fact, similar to problems that have been solved
through the application of
various existing AI technologies. For example, a rule-based
system that would assist users
in diagnosing and solving their own technical
problems could be modeled
after the myriad diagnosis systems that currently exist.
(Examples of diagnosis systems
can be found in basic texts such as Durkin 1994
and Turbin 2001a.) A natural language
interpreter could translate informational
questions from novice and
intermediate users into a syntax that could query the
warehouse directly. TRAC has
previously investigated a natural language interface
(Roberge 1994), and continues
to explore many other avenues as well.
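To make the rule-based idea concrete, the following is a minimal Python sketch of a diagnostic helper for technical problems. It is an assumption-laden illustration, not a system TRAC has built: the symptom names, the rules, and the advice text are all invented.

```python
# Each rule pairs a condition on the reported symptoms with a piece of advice.
# All symptom keys and messages are hypothetical.
RULES = [
    (lambda s: s.get("page_loads") is False,
     "Check the network connection and the site address."),
    (lambda s: s.get("logged_in") is False,
     "Sign in to the subscription site before running a query."),
    (lambda s: s.get("result_rows") == 0,
     "Broaden the selection: the chosen district, year, or charge "
     "returned no matters."),
]


def diagnose(symptoms):
    """Return the advice of the first rule that fires, else refer to staff."""
    for condition, advice in RULES:
        if condition(symptoms):
            return advice
    return "No rule matched; refer the question to the help desk."
```

For example, `diagnose({"page_loads": True, "logged_in": False})` returns the sign-in advice, while an empty set of symptoms falls through to the help desk. Keeping the rules in an ordered list means staff could add or reorder rules without changing the matching logic.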
In Table II below, user
assistance solutions that have been used successfully
in other contexts have been
mapped to the matrix of TRAC user barriers. The
remaining shaded area
represents those problems that are unique to TRAC. Interestingly,
these are also the areas that
require the most specialized staff expertise.
Therefore developing AI
solutions that allow users to solve their own conceptual
problems would be especially
valuable. The remainder of the paper will focus
on how Artificial Intelligence
can be used to enhance TRAC’s data resources for
lawyers.
6.1. SYSTEM OF THE FUTURE
Developing a system that will
meet the informational needs of power users and the
conceptual needs of all users,
as depicted in the shaded area of Table II, provides
an exciting opportunity to
build on existing resources. TRAC’s data warehouse,
data mining tools, user menus
and screens, etc. (i.e., the metadata discussed previously)
already have
considerable built-in knowledge about statistical procedures,
the organization of the U.S.
government, the United States Code of laws, etc. This
knowledge is sometimes
encapsulated as small systems of rules and at other times
exists within the code that
processes the user requests. Clearly this knowledge is
available to be tapped to
support users in more abstract problems.
Table II. Application of standard AI technology to barrier reduction
Currently, TRAC is looking for
funding to develop an Expert System in the
form of an Automated Case
Advisor aimed at the practicing lawyer. The system
as envisioned would be
developed in two separate but interconnected modules
each of which would take
advantage of the existing knowledgebase. The first and
least complex of the Automated
Case Advisor modules would meet the conceptual
needs of novice and
intermediate users. It would provide information of the type
already being generated by
more sophisticated users. A detailed example of what
is envisioned is presented in
Appendix B.
The second module would
address the case needs that lawyers have for prediction,
planning, and diagnosis. For
example, this module would help lawyers answer
the following types of
questions:
• Based on cases similar to mine, how likely am
I to win my case?
• What have been the critical factors in other
cases?
• What are the weaknesses in my case?
• What supporting evidence is needed in order to
make my argument valid?
In addition to drawing on the
data warehouse and existing knowledgebase,
the second module would
benefit greatly from more detailed data (which TRAC
is currently seeking) about
each case. Furthermore, advanced statistical knowledge
would need to be encoded such
as what statistics were appropriate to use,
what the appropriate controls
were, and what limitations the data imposed on the
conclusions that could be
drawn.
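One way to picture the first question in the list above, together with the
statistical caution just described, is the Python sketch below. It estimates
a win rate from past cases that match the lawyer's criteria and declines to
report a rate when too few similar cases exist. The field names, the record
layout, and the threshold of 30 cases are assumptions made for illustration;
they do not describe TRAC's data or the statistical rules that would actually
be encoded.

```python
# Sketch of a "cases like mine" estimate. Field names, record layout, and
# the minimum-sample threshold are assumptions for illustration only.

MIN_CASES = 30  # hypothetical cutoff below which no rate is reported


def win_rate(cases, min_cases=MIN_CASES, **criteria):
    """Return (rate, n) for past cases matching every criterion.

    `cases` is a list of dicts with a boolean "won" field. `n` is the
    number of similar cases found; `rate` is None when n < min_cases,
    reflecting a limitation the data imposes on the conclusion.
    """
    similar = [c for c in cases
               if all(c.get(k) == v for k, v in criteria.items())]
    if len(similar) < min_cases:
        return None, len(similar)
    wins = sum(1 for c in similar if c["won"])
    return wins / len(similar), len(similar)


if __name__ == "__main__":
    # Made-up records: 12 wins and 28 losses in one hypothetical program.
    past = ([{"program": "X", "won": True}] * 12
            + [{"program": "X", "won": False}] * 28)
    rate, n = win_rate(past, program="X")
    print(f"{n} similar cases; estimated win rate {rate:.0%}")
```

Returning the sample size alongside the rate lets the interface tell the
lawyer how much weight the estimate can bear.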
As envisioned, an Automated Case Advisor that extends
TRAC’s scarce expert resources would be well
suited to expert system technology for the following
reasons. First, because TRAC’s
focus is limited to data related to U.S. federal court
practice, the system wouldn’t
need to incorporate state laws and jurisdictional
issues. This makes the domain
complex enough that an expert system would be
worth developing yet
restricted enough to be workable. This balance is not always
easily achieved. Second, data
that provides the underlying foundation for
the system exists. TRAC currently
has in its warehouse source data sufficient to
complete Module 1; FOIA suits
are underway to obtain access to more detailed
data from new sources for
Module 2. Third, both the legal and statistical expertise
exists and can be verbalized.
In this case, the more intuitive legal insights and
expertise that might be
difficult to verbalize would be replaced by information
generated by data-analytical
processes. Finally, the system would augment both
statistical and legal
expertise, both of which are rare and expensive. To the extent
that an Automated Case Advisor
could replace a lawyer or statistician on the help
desk, the payoff could be
substantial. (See, for example, Ignizio 1991 and Turban
2001b for discussions of
problem characteristics that are suitable for Expert System
development.)
In addition to alleviating
help desk congestion, TRAC foresees other benefits
accruing from development of
an Automated Case Advisor. Because the user interface
would guide users through an
interactive process of information gathering, the
users need only enter
information they know. For novice users, this would enable
them to get information they
may not have realized existed. In other words, the
Automated Case Advisor could
help new users envision the possibilities that exist
for obtaining relevant
information. It would also help them to learn the capabilities
of the system and move to new
levels of expertise.
7. Discussion: The impact of
new or previously inaccessible information
TRAC is one example of how
lawyers and other non-analysts can be provided
with an entrée to previously
inaccessible information. TRAC may be at the cutting
edge of the electronic
dissemination of information, but it is not alone. Moreover,
every advancement that TRAC
makes in finding, packaging and disseminating information
is likely to raise the
standard (albeit slowly) across the field. As more
information becomes available,
the practicing lawyer needs to consider how the
increased availability may
impact the best practice standards of the profession.
The kind of information we
have been discussing here has only been available
through what we generally call
“experience”. The legal rules identified in
the statutes and cases establish
the rights and duties that apply in a particular
situation. However, the letter
of the law in the books is often different from the
operation of the law in
practice. The practicing lawyer’s experience in past cases
often provides him/her with
insights about the nature of the differences between
theory and practice.
Experience is not equally
available to all lawyers. A system that enabled lawyers
to access some or all of these
insights would be desirable so that clients of all
lawyers have the benefit of
such information.
In addition, insights can be
misleading. So, some confirmation of such information
would be desirable. A recent
study, for example, attempted to test the widely
shared belief that defendants
are more successful than plaintiffs in getting adverse
trial outcomes reversed on
appeal (Cox 2002). Basically, the study wanted to know
what percentage of
plaintiff-appeals cases were decided for the plaintiff, and what
percentage of
defendant-appeals cases were decided for the defendant.
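The two percentages at issue are simple to compute once appeal records can
be queried in bulk. The Python sketch below shows the calculation on a
handful of invented records; the record layout is an assumption, and the
numbers are placeholders that do not reproduce the study's data or findings.

```python
# Sketch: share of appeals decided for the appealing party, by party type.
# The (appellant, reversed) record layout is a hypothetical simplification.
from collections import Counter


def reversal_rates(appeals):
    """Percentage of appeals decided for the appellant, per appellant type.

    `appeals` is an iterable of (appellant, reversed) pairs, where
    `appellant` is "plaintiff" or "defendant" and `reversed` is True when
    the adverse trial outcome was reversed on appeal.
    """
    filed = Counter()
    won = Counter()
    for appellant, was_reversed in appeals:
        filed[appellant] += 1
        if was_reversed:
            won[appellant] += 1
    return {side: 100.0 * won[side] / filed[side] for side in filed}


if __name__ == "__main__":
    # Invented records for illustration only.
    sample = [("plaintiff", True), ("plaintiff", False),
              ("plaintiff", False), ("plaintiff", False),
              ("defendant", True), ("defendant", False)]
    for side, pct in reversal_rates(sample).items():
        print(f"{side} appeals decided for the {side}: {pct:.1f}%")
```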
Individual cases are available
for free from a variety of sites including court
sites (e.g.
http://www.uscourts.gov/) and the Cornell Law School Legal Information
site
(http://www.law.cornell.edu/), but the sites are not set up to permit users
to ask questions of the kind
raised in the appellate success study. Nor can users ask
questions about regional
variations (if any), about variations over time (particularly
as the court membership
changes), or about differences in success rates for various
kinds of cases (are plaintiffs
significantly more successful on appeal in employment
cases than in insurance
cases?).
The information gleaned from
answers to such questions can help the practitioner
to evaluate the more
traditional legal rule information contained in the text
of the statutes and cases.
With access to both kinds of information, the individual
practitioner may be able to
improve the advice to be given to clients about whether
or not to appeal in a
particular case.
Although TRAC’s data warehouse
is limited to federal information, the points
made above are equally
applicable to other levels of information. For example,
understanding the
investigative and prosecutorial policies of the New York State
Attorney General’s office
could be very useful to the lawyer who is representing
a client in a matter involving
the AG’s office. Yet, the AG’s web site
(http://www.oag.state.ny.us/home.html)
displays limited information about its investigative
and prosecutorial activities.
One section describes the functions of
various departments. Another
contains press releases that describe some investigative
and prosecutorial activities.
A site user could sift through the press releases
and count the incidence of
particular investigations or prosecutions. But would the
results be complete or
meaningful? Some smaller investigations or prosecutions
may not be memorialized in a
press release. The press releases may report the prosecutions
in a particular matter, but
the convictions, acquittals or other dispositions
of the prosecutions may not
make it into a press release.
There is also another major
group of seldom-considered legal practitioners –
the lawyers who are
responsible for administering the law. At the federal level this
group includes the United
States Attorneys who have an important supervisory role
in the functioning of criminal
and civil systems within each of the judicial districts
and the heads of the numerous
divisions within the Justice Department.
Conversations with a number of
former department executives and U.S. attorneys
strongly suggest that many of
them focus on the handling of individual cases.
While such an emphasis
sometimes is justified, it can also impede the consideration
of broader administrative
questions. Does the allocation of assistant U.S. attorneys
around the country represent
real needs or is the process sometimes influenced
by irrelevant political
considerations? How well does the distribution of criminal
referrals coming from the
investigative agencies meet the needs of the district?
Given the nature of the
district, are the agencies paying too little or too much
attention to white-collar
crime, or official corruption or immigration problems?
To answer such questions, the
lawyer/administrator needs information about the
actual number of assistant
U.S. attorneys in each district and the number of assistants
in relation to the district’s
population. The lawyer/administrator also should be
able to generate appropriate
statistical summaries so a variety of comparisons can
be made about the performance
of the federal agencies working in one area and
how this performance ranks
with other districts.
Although the federal
lawyer/administrator can obtain certain raw counts on the
web site maintained by the
Executive Office for United States Attorneys, it does not
appear that more sophisticated
material – even simple per capita rates and percents
– is easily available to help
them be effective managers.
Information about how the law
works in practice can be a valuable component
of the practicing lawyer’s
advice. Governmental and other web sites are constantly
expanding the kinds of
operational and other information available and reducing
the burdens of accessing that
information. However, the sites usually stop short of
providing a full range of the
data-mining and other user tools that would enable
the practicing lawyer to
correctly assess the environment in which advice is to be
given.
TRAC is a work in progress. It
is continuously expanding both the kinds of data
mining tools provided and
the breadth and depth of the data. As other sites
copy and build upon the TRAC
model, the range of available information is likely
to expand exponentially thus
increasing the impact on the practice of law. If data
about the summons, service,
and disqualification of grand and petty jurors becomes
available on line, for
example, could cases for discrimination at various points in
the process be made more
easily and cheaply than when the claimant had to pay
for the collection and
analysis of the data? Finally, could the new information even
create new causes of action?
8. Conclusion
Information technology has had
and continues to have a huge impact on our work
and the way in which we do it.
The practice of law is no exception. Previously,
veteran attorneys have had to
rely on their experience with the workings of the legal
system to design strategies
for handling individual cases. While experience will
always play an important role
in this regard, advanced technology, which makes it
possible to access and use
immense stores of electronic data, can now offer another
alternative.
In this paper we have
presented a model for making previously inaccessible
information about the U.S.
federal government available to lawyers. The model,
as implemented by
Transactional Records Access Clearinghouse (TRAC), uses a
data warehouse and specially
designed data mining tools to provide access to a
wide variety and vast amount
of federal data. The media, public interest groups,
concerned citizens, and even
the government itself are already using TRAC’s web
site. Users need not be trained
data analysts to use the data mining tools, although
experienced analysts often
choose to use them because they are both powerful and
easy to use.
Despite the uncomplicated
point and click interface, lawyers and other subscribers
have sometimes experienced
problems making full use of TRAC’s system.
Problems range from knowing
where to click to envisioning how the information
could be useful. Currently,
TRAC staff members work with subscribers individually
to solve their problems.
However, this paper has proposed an expert system
that could automate the help
desk. The expert system, which is visualized as an
Automated Case Advisor, would
present information directly relevant to lawyers
as well as help them learn the
capabilities of the system.
As additional producers of
data adopt TRAC’s model, the amount of information
available is likely to
increase dramatically. As a result, attorneys undoubtedly
will witness changes to the
way their profession is practiced, how the law is administered,
and the types of research they
are expected to do. Cases that previously
would have required costly
collection and analysis of data will be brought more
easily. Advice that was
previously based on hunches alone will be grounded in
data. The overall records of
judges and prosecutors, now available for all to see,
will provide an image of the
workings of the U.S. federal system of jurisprudence
that would not exist without
information technology. In short, we can expect the
impact to be dramatic.
Note
1 Portions of this paper are
based on “Data Warehouses and Data Mining Tools for the Legal Profession:
Using Information Technology to
Raise the Standard of Practice” by Roberge, L., Long, S.,
and Burnham, D., forthcoming in the Syracuse Law Review, Volume 52, Book 4.
Appendix A: Examples of user
interface for TRAC’s data mining tools
Figure 1A. Criminal express tool (white
collar crime – declinations).
Figure 2A. Criminal express tool output
(white collar crime – declinations).
Figure 3A. Civil going deeper tool.
Figure 4A. Fourth level table output from
civil going deeper tool.
Figure 5A. Analyzer tool – creating data
slice.
Appendix B: Automated Case
Advisor dialog (Module 1)
The sample dialog below
presents one of many possible interactions that a lawyer
could have with the Automated
Case Advisor. (For alternate paths, see the tree
structure in Figure 2B below.)
At the outset of the consultation, the Automated Case
Advisor would ask for the
focus that the lawyer would like. In all cases, a list of
choices would be presented to
the user. Where appropriate, the user may also enter
the choice of unknown.
Automated Case Advisor questions          Lawyer answers
Focus of Consult?                         Settlement Negotiation
What type of case?                        Criminal
What stage?                               Referral
Which agency investigated?                FBI
What is the nature of the case?           Health Care Fraud
What is the lead charge?                  unknown
What is the referral district?            Penn, W.
Has the prosecution been filed?           No
Who is the prosecutor handling the case?  Jane Doe
Who is the judge?                         Unknown
Figure 1B. Sample dialog with automated
case advisor.
From here, the Automated Case
Advisor would prepare a report in the form of
an html page based on answers
that had been input. This allows the user to access as
much or as little information
as needed. From here it will also be possible to change
the input thus allowing the
lawyer to do “what-if” analyses with different scenarios.
Based on the example
interaction above, a report might include the information
shown in Figure 3B below.
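The step from dialog answers to a report, including the handling of unknown
answers and the "what-if" changes described above, could be sketched in
Python as follows. The key names are hypothetical labels for the questions in
the sample dialog of Figure 1B, and the functions only prepare report
criteria; they assume nothing about how the warehouse itself would be
queried.

```python
# Sketch: turning dialog answers into report criteria. The key names are
# hypothetical labels for the questions in the sample dialog (Figure 1B).


def build_filter(answers):
    """Drop 'unknown' answers so that they do not constrain the report."""
    return {question: answer for question, answer in answers.items()
            if answer.strip().lower() != "unknown"}


def what_if(answers, **changes):
    """Return a copy of the answers with some of them replaced."""
    revised = dict(answers)
    revised.update(changes)
    return revised


if __name__ == "__main__":
    answers = {
        "focus": "Settlement Negotiation",
        "case_type": "Criminal",
        "stage": "Referral",
        "agency": "FBI",
        "nature": "Health Care Fraud",
        "lead_charge": "unknown",
        "district": "Penn, W.",
        "filed": "No",
        "prosecutor": "Jane Doe",
        "judge": "Unknown",
    }
    print(build_filter(answers))
    # A what-if scenario: the same case at a later, hypothetical stage.
    print(build_filter(what_if(answers, filed="Yes")))
```

Because `what_if` copies the answers instead of changing them in place, the
original scenario remains available for comparison with each alternative.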
Figure 2B. Case advisor top level tree
structure (Module 1).
Figure 3B. Case advisor interactive
output.
References
Ballou, D. P. and Tayi, G. K.
(1999). Enhancing Data Quality In Data Warehouse Environments.
Communications of the ACM
42(1): 73–78.
Benander, A., Benander, B., and
Fadlalla, A. (2000). Data Warehouse Administration and Management.
Information Systems Management
17(1): 71–80.
Berry, M. and Linoff, G.
(1997). Data Mining Techniques. John Wiley and Sons, Inc.: New York.
Cox, G. (2002). Voir Dire: Those
Appealing Defendants. The National Law Journal 24(19): citing
Schwab, Eisenberg, and
Claremont. (forthcoming) University of Illinois Law Review,
Plaintiphobia.
Durkin, J. (1994). Expert
Systems Design and Development. Prentice Hall: Englewood Cliffs, NJ.
Dyche, J. (2000). e-Data:
Turning Data into Information with Data Warehousing. Addison-Wesley:
Reading, MA.
Friedland, L. (1998). Accessing
the Data Warehouse: Designing Tools to Facilitate Business
Understanding. Interactions,
25–36.
Hall, O. P. Jr. (2000). Mining
the Store. Journal of Business Strategy 22(2): 24–27.
Ignizio, J. (1991). An
Introduction to Expert Systems. McGraw-Hill: New York.
Jarke, M., Lenzerini, M.,
Vassiliou, Y., and Vassiliadis, P. (2000). Fundamentals of Data Warehouses.
Springer-Verlag: New York.
Kimball, R. and Ross, M. (2002).
The Data Warehouse Toolkit, 2nd Edition. John Wiley and Sons:
New York.
MacDonald, M. and Oettinger, A.
(2002). Information Overload. Harvard International Review, 44–
48.
Mallach, E. (2000). Decision
Support and Data Warehouse Systems. McGraw-Hill: Boston.
Roberge, L. (1994). The Impact
of a Natural Language Interface on Barriers to Information Access.
Ph.D. diss., School of
Management, Syracuse University, Syracuse, New York.
Roberge, L., Long, S., and
Burnham, D. (2002). Data Warehouses and Data Mining Tools for the
Legal Profession: Using
Technology to Raise the Standard of Practice. Forthcoming in Syracuse
University Law Review 52: Book
4.
Toffler, A. (1970). Future
Shock. Random House, Inc.: New York.
Turban, E. and Aronson, J.
(2001a). Decision Support Systems and Intelligent Agents, 6th ed.
Prentice Hall: Upper Saddle
River, NJ.
Turban, E., McLean, E., and
Wetherbe, J. (2001b). Information Technology for Management. John
Wiley and Sons: New York.
Witten, I. and Frank, E.
(2000). Data Mining. Academic Press: San Diego, CA.
Retrieved from: http://www.kluweronline.com