Make a documents database
Introduction
Many people organize their files in a directory structure. When you have
many documents, you want to be able to find them again easily and quickly. You need
a search engine able to find a document based on its keywords, title, author name,
kind of document, ...
The purpose of this section is to explain, step by step, a solution that
finds any document stored somewhere in a directory structure very quickly.
Enter your search criteria in a web form and you get a list of matching documents. Click on one
of the documents in the list and it opens !
This solution is multi-platform, free and flexible. It is just an assembly of well-known software
with small scripts around it to glue the whole thing together. The basic ingredients of this recipe are :
Tcl/Tk scripting language
MySQL database
Apache web server
PHP scripting language
Your favorite web-browser
If many users need to access your database, you only have to install the software once, on
a central computer. Each user only needs a web browser to access the database from anywhere on the
network. To prevent abuse, users must give a password before they can access the
documents.
The story in an example
Suppose you have a directory 'Documents' containing the following files :
Documents
 phones.txt
 Project1
  hf38_specs.pdf
  kg76_specs.pdf
  report.doc
 Project2
  hf_meas.xls
  letter.doc
For each file you want to see in your database,
you add a description file with the extension '.nfo' as follows :
Documents
 phones.nfo
 phones.txt
 Project1.nfo
 Project1
  hf38_specs.nfo
  hf38_specs.pdf
  kg76_specs.nfo
  kg76_specs.pdf
  report.nfo
  report.doc
 Project2.nfo
 Project2
  hf_meas.nfo
  hf_meas.xls
  letter.nfo
  letter.doc
The description files contain information about the corresponding document,
one 'field : value' pair per line.
Only the first four letters of the field name are significant.
The possible fields are :
titl : the title of the document
keyw : keywords
auth : the name of the author(s)
crea : the creation date
proj : the name of the project
type : the kind of document (for example : calculation, measurement, document, script ...)
refe : a reference or a document number
All fields are optional but you should at least fill the field 'titl'.
In the previous example, we could have :
phones.nfo
title : phone book
keywords : phone
author : Bruno Champagne
type : document
project : none
Project2.nfo
title : Project no 2
project : project2
hf_meas.nfo
title : HF measurements
keywords : HF jitter
type : measurements
author : Antonio Soubri
letter.nfo
title : research contract
keywords : KGF contract
type : document
author : Mona Moors
Instead of 'author', you may just write 'auth'.
Don't forget to create a '.nfo' file for each directory (even an empty file),
otherwise sub-directories won't be scanned.
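The rule "only the first four letters are significant" can be illustrated with a few lines of Python. This is only a sketch and not the code of the Tcl script: the function name 'parse_nfo', the canonical names in 'FIELDS', the case-insensitive matching and the choice to ignore unknown fields are all assumptions.

```python
# Canonical field names, keyed by their first four letters.
# The canonical names on the right are chosen for this example only.
FIELDS = {
    "titl": "title",
    "keyw": "keywords",
    "auth": "author",
    "crea": "created",
    "proj": "project",
    "type": "type",
    "refe": "reference",
}


def parse_nfo(text):
    """Parse the contents of a '.nfo' file into a dict of known fields."""
    info = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'field : value' line
        # Assumption: matching is case-insensitive and unknown fields are ignored.
        key = FIELDS.get(name.strip().lower()[:4])
        if key is not None:
            info[key] = value.strip()
    return info


sample = "title : HF measurements\nkeyw : HF jitter\nauthor : Antonio Soubri\n"
print(parse_nfo(sample))
```

With this approach 'author', 'auth' and 'authors' all end up in the same field, and an empty '.nfo' file simply gives an empty dictionary.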
A Tcl script will scan the directories and try to find any '.nfo' file and the
corresponding document.
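To give an idea of what such a scan involves, here is a small Python sketch (the real script is written in Tcl and may work differently). The function name 'find_documents' and the matching rule "same name without extension" are assumptions made for this example.

```python
import os
import tempfile


def find_documents(root):
    """Yield (item_path, nfo_path) for every described item below 'root'."""
    names = sorted(os.listdir(root))
    for name in names:
        if not name.endswith(".nfo"):
            continue
        stem = name[: -len(".nfo")]
        for other in names:
            # 'report.nfo' describes 'report.doc'; 'Project1.nfo' describes 'Project1'.
            if other != name and (other == stem or os.path.splitext(other)[0] == stem):
                item = os.path.join(root, other)
                yield item, os.path.join(root, name)
                if os.path.isdir(item):
                    # A directory is only entered when it has its own '.nfo'.
                    yield from find_documents(item)


# Small demonstration on a throw-away directory tree.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "Project1"))
    os.mkdir(os.path.join(top, "Unlisted"))  # no '.nfo', so it is skipped
    for rel in ("phones.txt", "phones.nfo", "Project1.nfo",
                "Project1/report.doc", "Project1/report.nfo",
                "Unlisted/memo.doc", "Unlisted/memo.nfo"):
        open(os.path.join(top, *rel.split("/")), "w").close()
    found = [os.path.relpath(item, top).replace(os.sep, "/")
             for item, _ in find_documents(top)]

print(found)
```

The directory 'Unlisted' has no '.nfo' file of its own, so nothing inside it is reported, which matches the warning about directories given above.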
The script will fill the database (here an SQL database). In the table below, you see what
the database entries will look like (only a few columns are shown) :
title               keywords       type           project
phone book          phone          document       none
Project no 2        -              -              project2
HF measurements     HF jitter      measurements   project2
research contract   KGF contract   document       project2
...                 ...            ...            ...
This table also contains the name of the author, a reference and
the location of the file.
Actually, it is slightly more complicated. We assume there is a limited
number of types and projects. So, the real database contains 3 tables :
the documents table is the same as described above, except that the
columns 'type' and 'project' hold an index instead of a string. The index
points to an entry in another table, type or project
the project table has two columns : the first is an index and the second
is a string containing the name of the corresponding project
the type table has two columns : the first is an index and the second
is a string containing the name of the corresponding type
But those are details you don't need to care about ...
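For the curious, the three-table layout can be tried out with the following Python sketch. It uses an in-memory SQLite database in place of the MySQL server, and every column name as well as the helpers 'lookup_id' and 'add_document' are hypothetical. The tables really created by 'initdb.tcl' may look different.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server; the column names
# are hypothetical, not those created by 'initdb.tcl'.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE type    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        keywords TEXT,
        type_id INTEGER REFERENCES type(id),
        project_id INTEGER REFERENCES project(id)
    );
""")


def lookup_id(table, name):
    """Return the index of 'name' in a lookup table, adding it if new."""
    # 'table' is always one of the two fixed names used below, never user input.
    db.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    return db.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()[0]


def add_document(title, keywords, type_name, project_name):
    """Store one document, replacing type and project names by their index."""
    db.execute(
        "INSERT INTO documents (title, keywords, type_id, project_id) VALUES (?, ?, ?, ?)",
        (title, keywords, lookup_id("type", type_name), lookup_id("project", project_name)),
    )


add_document("HF measurements", "HF jitter", "measurements", "project2")
add_document("research contract", "KGF contract", "document", "project2")

# Joining the three tables gives back the flat view with readable names.
rows = db.execute("""
    SELECT d.title, t.name, p.name
    FROM documents d
    JOIN type t ON t.id = d.type_id
    JOIN project p ON p.id = d.project_id
    ORDER BY d.id
""").fetchall()
print(rows)
```

Both documents share the single 'project2' row of the project table, which is the point of storing an index instead of a string.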
Field values may be inherited from a parent directory :
type is inherited from a parent directory if not specified for the current file
project is inherited from a parent directory if not specified for the current file
the keywords of the parent directories are added to the keywords of the current file
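These three rules fit in a few lines of Python. The sketch below is an illustration only: the function name 'inherit' is hypothetical, and it assumes that keywords are separated by spaces and that duplicate keywords are dropped, which the real script may or may not do.

```python
def inherit(parent, own):
    """Combine a file's own fields with those of its parent directory."""
    merged = dict(own)
    # 'type' and 'project' fall back to the parent's value when missing.
    for field in ("type", "project"):
        if field not in merged and field in parent:
            merged[field] = parent[field]
    # Keywords accumulate: the parent's keywords are added to the file's own.
    words = parent.get("keywords", "").split() + own.get("keywords", "").split()
    if words:
        # Assumption: duplicates are dropped, keeping the first occurrence.
        merged["keywords"] = " ".join(dict.fromkeys(words))
    return merged


project2 = {"title": "Project no 2", "project": "project2"}
hf_meas = {"title": "HF measurements", "keywords": "HF jitter", "type": "measurements"}
print(inherit(project2, hf_meas))
```

Here 'hf_meas' gets the project 'project2' from its directory, as in the table shown earlier. For deeper trees, the same function can be applied level by level, from the top directory down to the file.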
We can connect to the search engine with a simple web browser.
If we configure Apache to restrict access to our files,
the user is first prompted for his/her login and password.
Then we get a search form such as the following :
Search document
 project           : any project / none / project1 / project2 (choice list)
 title containing  :
 keywords          :
 reference         :
 document type     : any type / document / measurement (choice list)
 dirname contains  :
 filename contains :
 authors contains  :
As you can see in the form above, the possible values for 'project' and 'type' are automatically
filled from the scanned files.
After you've typed your search criteria, you get a list of matching documents :
Search results
title / keywords                   author(s)         type / file             project
phone book / phone                 Bruno Champagne   document / phones.txt   none
research contract / KGF contract   Mona Moors        document / letter.doc   project2
The title of the document is also a hyperlink to the document. If your browser has the appropriate plug-ins,
one click on the title is enough to open the document.
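How the form fields could be turned into a database query is sketched below in Python. This is not the code of 'results.php': the function name 'build_query', the column names and the flat 'documents' view are assumptions made to keep the example short.

```python
# Values of the choice lists that mean "no restriction".
ANY = {"any project", "any type"}

# Form field -> column name; the column names are hypothetical.
TEXT_COLUMNS = {
    "title": "title",
    "keywords": "keywords",
    "reference": "reference",
    "authors": "author",
    "dirname": "dirname",
    "filename": "filename",
}


def build_query(criteria):
    """Return (sql, params) for the non-empty search criteria."""
    clauses, params = [], []
    for field, column in TEXT_COLUMNS.items():
        value = criteria.get(field, "").strip()
        if value:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{value}%")  # 'contains' match
    # 'project' and 'type' come from choice lists, so they are matched exactly.
    for field in ("project", "type"):
        value = criteria.get(field, "")
        if value and value not in ANY:
            clauses.append(f"{field} = ?")
            params.append(value)
    sql = "SELECT * FROM documents"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


print(build_query({"title": "contract", "project": "project2", "type": "any type"}))
```

Passing the values as parameters instead of pasting them into the SQL string keeps a stray quote in a search term from breaking the query. An empty form gives a query without a WHERE clause, that is, the list of all documents.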
Apache/PHP setup
Download Apache at http://httpd.apache.org.
Install it. Make a directory where you will put your documents. For example, make 'C:/html'.
Download PHP at http://www.php.net/. Install it.
In the configuration file of Apache, httpd.conf, change the setting 'DocumentRoot'
to point to the directory containing your html files.
For example,
DocumentRoot "C:/html"
You also need to specify the directories to be served and the
corresponding 'alias' names. For example, if you want to be able to
access the directories 'C:/documents_project1' and 'C:/personnal_docs',
add the following lines to the file httpd.conf :
alias "/project1" "C:/documents_project1"
alias "/family" "C:/personnal_docs"
In the same file, check the section included between
'<Directory />' and '</Directory>'. It should look like this :
<Directory />
Options FollowSymlinks
AllowOverride None
AuthName "restricted area"
AuthType Basic
AuthUserFile "c:/pass.txt"
require valid-user
</Directory>
In the example above, we specify that every user has to log in before accessing
the documents. The file containing the user logins and passwords is (in this example) named 'c:/pass.txt'.
You also need to tell Apache where to find the PHP interpreter.
For example, if the interpreter is 'C:/PHP/PHP.EXE', then you need to check that
the following lines are present in the file httpd.conf :
ScriptAlias /php/ "c:/php/"
AddType application/x-httpd-php .php
AddType application/x-httpd-php-source .phps
Action application/x-httpd-php "/php/php.exe"
Define a new user. Open a console and go to the bin directory of Apache (for Windows users,
this directory should be 'C:/Program files/Apache Group/Apache/bin'). Type
htpasswd -c c:/pass.txt username
where you should replace 'username' with the name of the user you wish to add.
To add a second user, type
htpasswd c:/pass.txt username2
where you should replace 'username2' with the name of the second user. Do not use the '-c' option again, because it creates a new file and erases the existing users.
Start the Apache server.
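To check the password protection without a browser, you can send the login and password yourself. The Python sketch below shows what a browser does with 'AuthType Basic' : it sends the header 'Authorization: Basic' followed by the base64 encoding of 'login:password'. The helper names 'basic_auth_header' and 'fetch' and the password 'secret' are only examples.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Build the value of the 'Authorization' header for HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch(url, user, password):
    """Request 'url' with credentials; urlopen raises HTTPError if they are rejected."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    with urllib.request.urlopen(request) as response:
        return response.read()


# Only the header is built here; call fetch("http://127.0.0.1/", ...) once the server runs.
print(basic_auth_header("username", "secret"))
```

Note that base64 is an encoding, not encryption : with Basic authentication the password travels over the network in a form anyone can decode.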
Tcl + SQL library
Download Tcl/Tk 8.3 from
http://dev.scriptics.com/software/tcltk/download83.html.
Install it.
Now you need a library that lets Tcl access MySQL.
Download fbsql at http://www.fastbase.co.nz/fbsql/index.html.
Windows users: install the dll file in the bin directory of Tcl.
Unix users: follow the instructions ...
Scripts installation
Download the following zip file scripts.zip
and install its contents in the root directory of the Apache
server. In our example, install the files in the directory 'C:/html'.
You will find the following files :
index.html : the first file Apache will open when you connect. It just contains a link to
'search.php'. But it's up to you to make a more attractive site ...
search.php : PHP script containing the search form
results.php : PHP script that shows the results of the search
initdb.tcl : prepares MySQL for the documents database
makeindex.tcl : scans the directories and fills the database. Every hour, it
will update the database in the background.
MySQL setup
If needed, download MySQL at http://www.mysql.org.
Install it.
Start the server :
Under Windows, if you don't see a traffic light in the corner of the screen, run
'C:/mysql/bin/winmysqladmin' and
start the server by right-clicking on the red light and selecting 'start server'
Under Linux, log in as 'root' and type : safe_mysqld &
Before you start using the database, you need to create the grant tables
(which determine who can connect to the database). So type :
mysql_install_db
Now we need to prepare MySQL for our documents database. Execute the script 'initdb.tcl'.
It tries to connect to the MySQL server.
If needed, it prompts for a new password for the user 'root'. 'root' is the administrator
of the MySQL database (not to be confused with the administrator of a Unix machine !).
It creates a new user 'db_user'. The password is 'db_pass'. This user can only connect
locally.
It creates a new database 'documents_db' and all the needed tables. One of them
('scan_dirs') contains the list of directories to be scanned. The 'initdb.tcl' script inserts one entry
in this table : 'documents' (the default name of the directory where you will put your documents).
Restart the MySQL server.
Changing the directories to be scanned
The script 'makeindex.tcl' mentioned above has to know which directories to scan.
If you only want to scan the directory 'documents' (or more precisely, 'C:/html/documents'),
you don't need to change anything (because it is the default setting).
Suppose you want to scan the directories named 'C:/documents_project1' and
'C:/personnal_docs' (see Apache setup). Start the MySQL client in a console
(Windows users : open an MS-DOS window and go to the directory C:/mysql/bin ; Unix users : no problem) :
mysql -udb_user -p
When prompted for the password, enter 'db_pass'. Then,
use documents_db;
To see the list of the scanned directories, type
select * from scan_dirs;
You will only see the directory 'documents'. To remove this first entry from the list, type
delete from scan_dirs where id=1;
Now, you can insert the new entries.
insert into scan_dirs values(null,
'C:/documents_project1','project1');
insert into scan_dirs values(null,
'C:/personnal_docs','family');
As you can see, for each directory you add, you also need to give its alias name
(the same as the one specified in the setup of the Apache server). This is needed so that
the search results are linked to the right web address.
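The role of the alias is easy to see in a small Python sketch that converts the location of a file into the address served by Apache. The function name 'file_url' and the default host are assumptions; the two entries are the ones inserted in 'scan_dirs' above.

```python
from urllib.parse import quote

# (directory, alias) pairs as stored in 'scan_dirs' in the example above.
SCAN_DIRS = [
    ("C:/documents_project1", "project1"),
    ("C:/personnal_docs", "family"),
]


def file_url(path, host="127.0.0.1"):
    """Map the path of a scanned file to the address Apache serves it under."""
    for directory, alias in SCAN_DIRS:
        prefix = directory.rstrip("/") + "/"
        if path.startswith(prefix):
            relative = path[len(prefix):]
            # quote() escapes spaces and other special characters, but keeps '/'.
            return f"http://{host}/{alias}/{quote(relative)}"
    raise ValueError(f"{path} is not inside a scanned directory")


print(file_url("C:/documents_project1/Project2/letter.doc"))
```

If the alias stored in 'scan_dirs' differs from the one in httpd.conf, the links in the search results point to an address that Apache does not know.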
Try it !
At this stage, the database should be fully operational.
First of all, create the directory where you want to place your documents.
For example, create 'C:/documents_project1'. Place a few documents in this directory and
make the corresponding '.nfo' files. You may also create sub-directories, but don't forget
to create a '.nfo' file for each sub-directory (even an empty '.nfo' file is OK).
Start the script 'makeindex.tcl'. It will run in the background and silently update the database every
two hours.
Start your favorite web browser. As the address, type 'http://127.0.0.1' if you are working without
a network, or enter the address of your computer (or of the one where the server is running) if you are
working on a network. You should now see the prompt for login and password. Enter the
user name and password you defined earlier. Click on the link 'search document'.
You should see
the form 'Search document'. Click on 'Submit' and you will see a list of all the indexed documents.
Remark : the use of this database is not limited to documents. You can really use it
for anything. For example, if you want to make a database of your friends, you can
make an html file for each of them where you enter any information you want. You can even add
a picture. Or you can just scan their business card. Then make the corresponding '.nfo' file.
Auto-logon and auto-startup on a Windows computer
This section only applies to Windows users. It should be easy to do
the same job on a Unix computer, but I have not tried it yet.
First of all, if you have an NT computer, you can configure the Apache web server and
the MySQL database as 'services' so that they are started automatically at startup.
Secondly, to be able to access the network drives, you need to log on. To be sure
the same login is used every time, the simplest solution is to use auto-logon. Click
on the 'Start' button, then 'Run...', and type 'regedit.exe'. Select the path
'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon'. Define the
following entries as strings :
AutoAdminLogon, value "1"
DefaultUserName
DefaultDomainName
DefaultPassword
Thirdly, to start the script 'makeindex.tcl' automatically after each login, go to
'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run' in the registry. Define
a string entry, for example "shutdownscript", and give it as value the location of the makeindex
script, for example "c:\html\makeindex.tcl".
If you also want to shut down automatically at a fixed time, refer to the chapter
'auto-shutdown'.