A Great Wall of Scholarship
1999.11
China's "Complete Library" put
on CD-ROM
By Dan Gillmor
In 1772, the Chinese Emperor Qianlong decided it was time to bring
all written human knowledge under one roof. He asked - in the way
that emperors "ask" - his subjects to submit all their books
for the collection he wanted to create.
An army of scholars, copyists, clerks and others went to work. They
cataloged tens of thousands of books and re-crafted them into 3,460
works under four classifications: Jing (Classics), Shi (History),
Zi (Philosophy) and Ji (Literature).
The pages of what is known as Siku Quanshu - Complete Library in
Four Branches of Literature - use standardized grids and characters,
and everything is catalogued through summaries, author biographies
and the like. To call it the biggest work ever is almost to trivialize
it. With 4.7 million pages and 800 million Chinese characters, Siku
Quanshu is awesomely grand, a Great Wall of scholarship.
It took 10 years to complete the first set, and another eight years
to make six copies in the original style. Only three of the originals
have survived various foreign invasions and other calamities, and
they're housed under lock and key in museum vaults.
Now, thanks to digital technology, Siku Quanshu is more widely available.
Digital Heritage Publishing Ltd. (www.skqs.com), a Hong Kong-based
company, has turned the enormous collection into an equally enormous
database, adding the kinds of tools that will enhance scholarship:
searching, annotation, hyperlinking and much more.
Everything about the electronic Siku Quanshu says quality. The CD-ROMs
and printed documentation, for example, come in color-coded boxes
so plush they might contain champagne or jewelry.
The software is equally fine, with beyond-faithful reproduction
and display. You can see what amounts to a digital photograph of the
original, and then look at an extremely similar rendition that incorporates
all of the database features such as searching. One especially nifty
touch is a virtual magnifying glass - move it across the screen and
individual characters almost leap out of the text for deeper analysis.
Gabriel C.M. Yu is chairman of IT Ventures Ltd., which controls
Digital Heritage. IT Ventures spans many different businesses, including
an online seller of books and ethnic products called Chinese Books
Cyber Store Ltd. (www.chinesebooks.net), which Yu and his colleagues
hope to turn into an Asian Amazon.com.
But I stopped by his offices, high in a skyscraper overlooking Hong
Kong's fabled harbor, mostly to hear about the Siku Quanshu. I'd been
told about it by the friend of a friend in Silicon Valley - and it
was everything I'd heard.
Yu takes justified pride in the project. He's spent about $8 million
so far and isn't sure when or if he'll make back his investment through
sales of the database, which spans more than 180 CD-ROMs and costs
thousands of dollars per copy. Libraries are the major customers so
far.
He doesn't seem to care much about direct return on this investment,
for several reasons. For one thing, the project has been as much a
labor of love as a business deal. But the technology his team created
for this project seems likely to pay back many times over.
Some of the techniques are quite complex, largely because of the
language. The electronic Siku Quanshu was developed using an extremely
comprehensive Chinese character set. It's based on Unicode, a system
under which characters from many languages can be represented in ways
on which international standards bodies agree.
Unicode includes traditional and simplified Chinese characters,
reflecting both ancient and modern writing forms. The Siku Quanshu
character set includes thousands of other Chinese characters, a total
of more than 32,000, but it remains based on Unicode. This means the
electronic version can be displayed on computers running different
language versions of Windows, and
it opens the possibility of someday running the product on other computing
platforms.
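To make the Unicode point concrete, here is a minimal Python sketch. It is an illustration only, not Digital Heritage's code. It shows that every character, traditional or simplified, has its own fixed code point, which is why the same text can be stored and shown the same way on differently localized systems. The function name code_points and the sample strings are assumptions made for this example.

```python
def code_points(text):
    """Return the Unicode code point of each character, e.g. 'U+0041' for 'A'."""
    return [f"U+{ord(ch):04X}" for ch in text]


# "Siku Quanshu" written with traditional and with simplified characters.
# (Sample strings chosen for this illustration.)
traditional = "四庫全書"
simplified = "四库全书"

for label, title in (("traditional", traditional), ("simplified", simplified)):
    print(label, title, code_points(title))

# Characters shared by both forms keep the same code point;
# characters that were simplified have a different one.
shared = [t for t, s in zip(traditional, simplified) if t == s]
print("shared characters:", shared)

# Encoding to bytes and decoding again returns the identical text,
# whatever the language of the machine that does it.
assert traditional.encode("utf-8").decode("utf-8") == traditional
```

The round trip at the end is the practical payoff: once text is expressed as agreed-upon code points, it no longer depends on a regional character encoding.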
The product also incorporates a powerful search engine, among many
other useful tools developed for this project. The engine recognizes
traditional and simplified characters and senses related words, returning
better results. Yu hopes to turn that work into a second-generation
Chinese Internet search engine. The scanning and optical-character
recognition technology developed for this project also could be put
to use creating digital libraries of the future, Yu believes.
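One simple way a search tool can treat traditional and simplified characters as equivalent is to fold both the query and the text into a single form before matching. The Python sketch below assumes that approach. The article does not say how the product's engine actually works, and the two-entry mapping table and the names fold and find_all are hypothetical. It does not attempt the "related words" feature.

```python
# Tiny illustrative mapping from traditional to simplified characters.
# A real table would hold thousands of entries; this one is an assumption
# made for the example.
TRAD_TO_SIMP = {"書": "书", "庫": "库"}
_FOLD_TABLE = str.maketrans(TRAD_TO_SIMP)


def fold(text):
    """Map every traditional character in the table to its simplified form."""
    return text.translate(_FOLD_TABLE)


def find_all(query, text):
    """Return the start index of every match, ignoring the
    traditional/simplified distinction for characters in the table."""
    if not query:
        return []
    folded_query, folded_text = fold(query), fold(text)
    hits = []
    start = folded_text.find(folded_query)
    while start != -1:
        hits.append(start)
        start = folded_text.find(folded_query, start + 1)
    return hits


# A simplified query finds both the traditional and the simplified spelling.
page = "四庫全書 四库全书"
print(find_all("全书", page))
```

Because the mapping here replaces one character with exactly one character, the positions found in the folded text are also valid positions in the original page.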
Digital Heritage ran the project, but it had a lot of help. Among
the contributors were an army of modern scholars and researchers. Development
partners included major universities in China, and the company has
lined up publishing partners for inside and outside China.
Will there be an online version? Maybe, but not right away. Yu noted
the recent decision by the Encyclopedia Britannica to move online,
but he's in no rush to do the same with this work. For one thing, Britannica
had competition from Microsoft, which created the lower-quality but
much cheaper Encarta encyclopedia. Siku Quanshu is one of a kind,
and the business case for another electronic version is questionable.
Yu's business instincts are obviously acute. He saw the flowering
of the Information Age earlier than most people here, and he has been
building some remarkable enterprises.
But I had the feeling he'll be happiest to be remembered for what
he and his team have done with Siku Quanshu. Framed commendations
from the top museums in Taipei and Beijing hold prominent positions
on his office wall.
And Yu, when I asked him why he'd embarked on this multi-year project,
talked about contributing to his culture, to the researchers who'll
be able to do far more learning about the past than they could ever have
done with the original volumes.
"We have changed people's ideas" about what is possible,
he said.
Maybe we could all change our minds, or at least evolve them a bit.
These techniques and others will help us collect and make better use
of the knowledge we're accumulating and creating today. Emperors may
start such projects, but it is all of humanity that needs to use them.