Internet Agents

Research and development of so-called Internet agents is one of the main application areas of Actor Prolog. Internet agents are programs that automate data collection and analysis and are written to solve the problems of a particular user (or group of users). In other words, Internet agents relate to universal search engines as custom-made programs relate to universal database management systems.

The Actor Prolog language exploits the main advantage of logic languages in this area, namely that the ideology and paradigm of Prolog programming (tree search with backtracking) match almost exactly the hypertext structure of the Web and the behavior of a user searching for information on the Internet. Prolog is at once a powerful tool for syntactic analysis of text, a query language, and a general-purpose programming language. In addition, Actor Prolog has an operational semantics adapted for correct logic programming in the dynamic environment of the Internet. In this chapter we will consider the use of logical actors, concurrent processes, and some predefined classes of Actor Prolog for logic programming of Internet agents.

Let us consider the Web_1.A example in the Web directory.

Example 1. Logical representation of HTML data.

In the current implementation of Actor Prolog, the 'Receptor' predefined class is the basic means of retrieving data from the Internet. This class contains predicates that retrieve data from the Internet and convert them into various Prolog terms. The example under consideration illustrates the use of the get_parameters, get_text, get_references, and get_trees predicates.

-------------------------------------------
-- An example of Actor Prolog program.   --
-- (c) 2002, Alexei A. Morozov, IRE RAS. --
-- Retrieving Web information.           --
-- Logical representation of HTML data.  --
-------------------------------------------
project: (('Web1'))
-------------------------------------------
class 'Web1' specializing 'Receptor':
--
location         = "http://www.cplire.ru";
max_waiting_time = seconds(12);
revision_period  = days(3);
attempt_period   = hours(1);
tags             = ["FONT","P"];

The slot location contains the address of an Internet resource. HTTP and FTP addresses are allowed, as well as local file names. The slot max_waiting_time contains the maximum time to wait for a server response. If the waiting time exceeds this value, the resource is considered temporarily inaccessible. The slot revision_period contains the time period within which Actor Prolog has to detect a possible change of the given Internet resource. The slot attempt_period contains the time period after which Actor Prolog will try to connect to the resource again if it is temporarily inaccessible. The slot tags will be considered later, when we discuss the get_trees predicate.

--
w1      = ('Report',
                title="Parameters",
                background_color='Yellow',
                x=0,
                y=0,
                width=40,
                height=10);
w2      = ('Report',
                title="Text of Resource",
                background_color='Cyan',
                x=40,
                y=0,
                width=40,
                height=10);
w3      = ('Report',
                title="Links of Resource",
                text_color='Magenta',
                x=0,
                y=10,
                width=40,
                height=15);
w4      = ('Report',
                title="Resource Structure",
                text_color='Blue',
                x=40,
                y=10,
                width=40,
                height=15);

These are the text windows in which the information retrieved from the Internet will be displayed.

--
[
goal:-!,
        Parameters== ?get_parameters(),
        w1 ? clear,
        write_parameters(Parameters),
        --
        Text== ?get_text(),
        w2 ? clear,
        w2 ? write(Text),
        --
        List== ?get_references(),
        w3 ? clear,
        write_references(1,List),
        --
        Structure== ?get_trees(),
        w4 ? clear,
        write_tag_structure(0,Structure).
--

write_parameters(entry(URL,D,T,_,S)):-
        w1 ? writeln("URL:  ",URL),
        write_date(D),
        write_time(T),!,
        w1 ? writeln("Size: ",S).
write_parameters(Error):-
        w1 ? writeln(Error).
--
write_date(date(Y,M,D)):-
        w1 ? writeln(
                "Date: ",Y,"-",M,"-",D).
--
write_time(time(H,M,S,_)):-
        w1 ? writeln(
                "Time: ",H,":",M,":",S).
--
write_references(N,[URL|Rest]):-!,
        w3 ? writef("%3d %s\n",N,URL),
        write_references(N+1,Rest).
write_references(_,_).
--
write_tag_structure(T,
                [block(Tag,List)|Rest]):-!,
        shift_text(T),
        w4 ? writeln("block \"",Tag,"\":"),
        write_tag_structure(T+1,List),
        write_tag_structure(T,Rest).
write_tag_structure(T,[Item|Rest]):-!,
        shift_text(T),
        w4 ? writeln(Item),
        write_tag_structure(T,Rest).
write_tag_structure(_,_).
--
shift_text(T):-
        T > 0,!,
        w4 ? write("   "),
        shift_text(T-1).
shift_text(_).
]
-------------------------------------------

This is the program output:



Fig. 1.1. Logical representation of data in HTML format.

Let us consider each window one by one.

In the "Parameters" window the program outputs the resource parameters obtained using the get_parameters predicate: the resource URL, the date and time of the last resource update, and the size of the resource. The predicate returns a complex term entry(URL, Date, Time, Attributes, Size). The value of the fourth argument of this term, Attributes, depends on the type of the resource and is not considered in this example.

In the "Text of Resource" window the program outputs the HTML text obtained from the Internet using the get_text predicate, without any additional processing.

In the "Links of Resource" window the program outputs the list of hyperlinks detected in the specified resource. The list of hyperlinks is obtained using the get_references predicate. Note that all detected hyperlinks are given in full form: where necessary, the URL is completed with the server address and the directory in which the specified resource is located.
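The completion of relative links into full form can be paraphrased outside Actor Prolog. The following is a minimal Python sketch, not the implementation of get_references; the class LinkCollector, the function collect_references, and the page name index.html are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin completes a relative link with the server
                # address and directory of the page it was found on.
                self.links.append(urljoin(self.base_url, value))


def collect_references(base_url, html_text):
    """Returns the list of absolute URLs of all hyperlinks in html_text."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical page content: one relative and one absolute link.
page = '<a href="space/prolog.html">A</a> <a href="http://www.cplire.ru/">B</a>'
links = collect_references("http://www.cplire.ru/Lab144/index.html", page)
for number, url in enumerate(links, start=1):
    # Same layout as writef("%3d %s\n",N,URL) in the listing above.
    print("%3d %s" % (number, url))
```

The relative link is completed with the server address and the directory of the page, while the absolute link is left unchanged.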

The fourth window is the most interesting one. In this window the program outputs data represented as a hierarchy of complex terms. The get_trees predicate used for data retrieval automatically converts the HTML text into trees. The get_trees predicate uses the list of names specified in the slot tags of the 'Receptor' class to determine which HTML tags should be recognized during text conversion. If a block whose tag is included in the list tags is detected in the text, a complex term block(Tag,List) is created, where Tag is the name of the tag (written in capital letters) and List is the content of the block, also represented in tree form. Tags that are not included in the list tags are simply ignored. Hyperlinks detected in the text are converted into complex terms of the form reference(URL,List), where URL is the address and List is the content of the hyperlink represented in tree form.
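The conversion described above can be approximated by a short Python sketch. It is only an illustration under stated assumptions, not the algorithm of get_trees: terms are modeled as tuples ("block", Tag, List) and ("reference", URL, List), text fragments are kept as plain strings, and the names TreeBuilder and html_to_trees are hypothetical.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Builds nested tuples from HTML, keeping only the listed tags."""

    def __init__(self, tags):
        super().__init__()
        self.tags = {name.upper() for name in tags}
        self.root = []
        # Each stack item is (tag name, list of children of the open block).
        self.stack = [(None, self.root)]

    def _open(self, name, node):
        self.stack[-1][1].append(node)
        self.stack.append((name, node[2]))

    def handle_starttag(self, tag, attrs):
        name = tag.upper()
        href = dict(attrs).get("href")
        if name == "A" and href is not None:
            self._open(name, ("reference", href, []))
        elif name in self.tags:
            self._open(name, ("block", name, []))
        # Any other tag is ignored; its text stays in the enclosing block.

    def handle_endtag(self, tag):
        name = tag.upper()
        # Close the nearest open block with this name, if there is one.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index][0] == name:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1][1].append(text)


def html_to_trees(html_text, tags):
    """Returns a list of trees for html_text, recognizing only the given tags."""
    builder = TreeBuilder(tags)
    builder.feed(html_text)
    builder.close()
    return builder.root


sample = ('<p>Actor <b>Prolog</b> <font size="2">see '
          '<a href="prolog.html">here</a></font></p>')
trees = html_to_trees(sample, ["FONT", "P"])
print(trees)
```

In this sketch the B tag is not in the list, so only its text survives, inside the enclosing P block.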

Note that the program execution does not end at this point. According to the semantics of the language, Actor Prolog is responsible for the correctness of the results of program execution even if the input data change over time. Therefore the program has to watch for possible changes of the specified resource and update its output. To stop the execution of the program press the button.

The support of model-theoretic semantics of programs running in a dynamic environment is a basic feature of Actor Prolog and its advantage over other logic languages. It is achieved by a special strategy of logical inference based on logical actors. The main details of this strategy are illustrated by the following examples.

Example 2. Retrieving information from several Web resources.

Let us consider the Web_3A.A example in the Web directory.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving information from several          --
--       Web resources.                               --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main':
--
actor1  = ('MyReceptor',
                location=
                        "http://www.cplire.ru/Lab144/"
                        "space/prolog.html",
                x=10,
                y=0);
actor2  = ('MyReceptor',
                location=
                        "http://www.comlab.ox.ac.uk/"
                        "archive/logic-prog.html",
                x=25,
                y=7);
actor3  = ('MyReceptor',
                location=
                        "http://www.cetus-links.org/"
                        "oo_prolog.html",
                x=40,
                y=14);
--
[
goal.
]
--------------------------------------------------------
class 'MyReceptor' specializing 'Receptor':
--
location;
max_waiting_time        = 0.0001;       -- 12.0;
revision_period         = 5.0;
attempt_period          = 1.0;
--
x;
y;
con     = ('Report',
                title=location,
                x,
                y,
                height=11,
                width=30);
--
[
goal:-
        Parameters== ?get_parameters(),
        write_parameters(Parameters).
--
write_parameters(entry(URL,Date,Time,_,Size)):-
        con ? writeln("URL:  ",URL),
        write_date(Date),
        write_time(Time),!,
        con ? writeln("Size: ",Size).
write_parameters(Error):-
        con ? writeln(Error).
--
write_date(date(Year,Month,Day)):-
        con ? writeln("Date: ",Year,"-",Month,"-",Day).
--
write_time(time(Hours,Minutes,Seconds,_)):-
        con ? writeln(
                "Time: ",Hours,":",Minutes,":",Seconds).
]
--------------------------------------------------------

This program creates three instances of the 'MyReceptor' class. According to the semantics of Actor Prolog, the goal actor is created in each of these worlds. This actor gets the parameters of the specified Internet resource and prints them in the corresponding text window.



Fig. 2.1. Retrieving information from several Web resources.

To illustrate the work of logical actors, I purposely assigned a very small value to the slot max_waiting_time inherited from the 'Receptor' predefined class (in practice it is recommended to set this value to several seconds, depending on the capacity of the Internet channel in use). As a result, access to the specified resources was artificially impeded. This caused the waiting time limit to be exceeded when the third resource was accessed; instead of parameters, the get_parameters predicate returned the error message lateness(MaxWaitingTime), where MaxWaitingTime is the exceeded time limit of the resource response.

The interpreter memorized this failure and after some time automatically tried to access the resource once again. The second attempt was successful. The interpreter invoked repeated proving of the appropriate actor and printed the retrieved parameters. The interpreter will then continue watching the state of the specified Internet resource, but a new repeated proving of the actor concerned will be invoked only if the parameters of the resource really change or if the resource is temporarily inaccessible at the next check.
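The checking schedule described above can be simulated in a few lines of Python. This is a simplified sketch and not the strategy of the Actor Prolog interpreter: time is simulated, the function watch and the table answers are hypothetical, and a repeated proving is assumed to happen whenever the observed outcome differs from the previous one.

```python
def watch(fetch, steps, revision_period=5.0, attempt_period=1.0):
    """Simulates the checks of one resource.

    fetch(t) returns the resource parameters at simulated time t,
    or None if the server did not answer within the waiting time.
    Returns a list of (time, outcome) pairs, one per repeated proving.
    """
    proofs = []
    known = object()  # sentinel: nothing has been proven yet
    now = 0.0
    for _ in range(steps):
        result = fetch(now)
        outcome = "lateness" if result is None else result
        if outcome != known:
            # The observed state has changed: the actor is proven again.
            proofs.append((now, outcome))
            known = outcome
        # After a failure retry soon; after a success wait for the revision.
        now += attempt_period if result is None else revision_period
    return proofs


# Hypothetical resource: first access is too slow, then its size is 100,
# and at time 11.0 the size changes to 120.
answers = {0.0: None, 1.0: 100, 6.0: 100, 11.0: 120}
proofs = watch(lambda t: answers[t], steps=4)
print(proofs)
```

The check at time 6.0 finds unchanged parameters, so it does not produce a new proving in this model.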

Note that the repeated proving of the actor corresponding to the considered resource did not affect the proof of the two other actors. An advantage of the control strategy of Actor Prolog is the possibility of modifying logical reasoning locally while keeping those results whose mathematical correctness was not violated by the change of the environment.

The results of the proof of the two other actors in the considered example are the following. The first actor successfully retrieved the parameters at the first access to the specified resource and printed them on the screen. The parameters of the Internet resource requested by the second actor were not obtained by the interpreter because the value of max_waiting_time was set too low. However, if a later attempt succeeds, the appropriate actor will be proven repeatedly and the parameters of the resource will be printed on the screen.

Unfortunately, such low-level tools as text windows do not support the logical semantics of a program. In practice this means that a programmer has to organize correct output of the information retrieved from the Internet manually. If the programmer forgets to refresh the data on the screen after repeated proof of an actor, obsolete information will remain on the screen. For Actor Prolog to guarantee the mathematical correctness of all the information on the screen, it is necessary to use higher-level output means such as modeless dialogs.

Example 3. Output of information retrieved from several Web resources.

Let us consider the Web_3A1R.A example in the Web directory. In this example the information received by logical actors is collected into a list with the help of a resident and output to the screen by means of a modeless dialog. Any change of the input data results in an automatic refresh of the data on the screen.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Output of information retrieved from         --
--       several Web resources.                       --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main' specializing 'Dialog':
--
identifier      = "output";
--
actor1  = ('MyReceptor',
                location=
                        "http://www.cplire.ru/Lab144/"
                        "space/prolog.html",
                x=10,
                y=1);
actor2  = ('MyReceptor',
                location=
                        "http://www.comlab.ox.ac.uk/"
                        "archive/logic-prog.html",
                x=20,
                y=8);
actor3  = ('MyReceptor',
                location=
                        "http://www.cetus-links.org/"
                        "oo_prolog.html",
                x=30,
                y=15);
--
target_worlds   = [actor1,actor2,actor3];
results         = target_worlds ?? get_data();
--
[
goal:-
        show.
]
--------------------------------------------------------
class 'MyReceptor' specializing 'Receptor':
--
location;
max_waiting_time        = 2.0;
revision_period         = 5.0;
attempt_period          = 1.0;
--
x;
y;
con     = ('Report',
                title=location,
                x,
                y,
                height=10,
                width=30);
--
parameters;
--
[
goal:-!,
        Parameters== ?get_parameters(),
        Parameters == parameters.
--
get_data= Line
        :-
        parameters := entry(URL,Date,Time,_,Size),
        Line== ?format(
                "%d\t%s\t%s\t%s",
                Size,
                ?time_to_string(Time),
                ?date_to_string(Date),
                URL).
--
date_to_string(date(Year,Month,Day))=
        ?format("%d-%02d-%04d",Day,Month,Year).
--
time_to_string(time(Hours,Minutes,Seconds,_))=
        ?format("%d:%d:%d",Hours,Minutes,Seconds).
]
--------------------------------------------------------

The dialog box definition is given below. The list of parameters is output with a listbox element with the UseTabStops flag. The dialog element is directly connected to the slot results of the corresponding class instance.

grid(80,25)
dialog_font("Arial",12,[])
dialog "output" (
          "",default,default,default,
          centered,centered,default)
     vbox(center)
          "Parameters of Web sites:"
          listbox['UseTabStops'](results,0000,45,4,[],[])
          button(close,"&Exit")
     end_of_vbox
end_of_dialog

The parameters of the element listbox(Content, Selection, Width, Height, InitialContent, InitialSelection) have the following meaning:

  1. Content is the program access code to the element content: a number, a string, or a slot name. In the considered example the slot results is used as the access code, therefore the value of this slot is automatically output to the screen.
  2. Selection is the access code to the list of strings selected in the listbox. In this example the user's selection of list elements does not affect program execution, therefore this code is not used.
  3. Width is the approximate width of the element (in characters).
  4. Height is the approximate height of the element (in lines).
  5. InitialContent is the initial content of the element, a list of strings.
  6. InitialSelection is the initial selection of list elements, a list of strings.

The program will open a dialog box on the screen, in which the list of strings computed by the resident and passed to the slot results will be output.
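The way the resident gathers one tab-separated line per world can be paraphrased in Python. This is only a rough sketch: the class World and the function collect are hypothetical, the parameter values are invented for illustration, and it is an assumption of the sketch that a world whose parameters are not yet known simply contributes no line.

```python
class World:
    """Stands for one 'MyReceptor' world with already retrieved parameters."""

    def __init__(self, url, parameters=None):
        self.url = url
        # (size, (hours, minutes, seconds), (year, month, day)) or None.
        self.parameters = parameters

    def get_data(self):
        """Returns one tab-separated line, or None if the call 'fails'."""
        if self.parameters is None:
            return None
        size, time, date = self.parameters
        year, month, day = date
        return "%d\t%s\t%s\t%s" % (
            size,
            "%d:%d:%d" % time,
            # Same layout as date_to_string in the listing: day-month-year.
            "%d-%02d-%04d" % (day, month, year),
            self.url,
        )


def collect(target_worlds):
    """Gathers the lines of all worlds in which get_data succeeds."""
    lines = []
    for world in target_worlds:
        line = world.get_data()
        if line is not None:
            lines.append(line)
    return lines


# Invented sample values; only the URLs come from the listing.
worlds = [
    World("http://www.cplire.ru/Lab144/space/prolog.html",
          (1024, (12, 5, 9), (2002, 3, 7))),
    World("http://www.cetus-links.org/oo_prolog.html"),
]
results = collect(worlds)
print(results)
```

With tab stops enabled in the listbox, each tab character separates one column of the displayed row.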



Fig. 3.1. Output of information retrieved from several Web resources.

A disadvantage of the considered example is that all three worlds in which the resident proves the get_data function belong to the same process. Therefore, if any one of the three resources changes, the interpreter will probably prove all three actors repeatedly together to simplify its work. To avoid these additional calculations, we will divide the program into several processes, assigning a single process to each Internet resource.
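The effect of this division can be shown with a toy Python model. It assumes the worst case mentioned above, in which a change seen by one actor causes every actor of the same process to be proven again; the function actors_to_reprove is hypothetical and does not describe the real interpreter.

```python
def actors_to_reprove(processes, changed_actor):
    """Returns the actors proven again when changed_actor sees a change.

    processes is a list of lists; each inner list holds the actors
    that belong to one process.
    """
    for members in processes:
        if changed_actor in members:
            return list(members)
    return []


one_process = [["actor1", "actor2", "actor3"]]
three_processes = [["actor1"], ["actor2"], ["actor3"]]

print(actors_to_reprove(one_process, "actor2"))
print(actors_to_reprove(three_processes, "actor2"))
```

With one process per resource, a change of the second resource touches only the second actor in this model.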

Let us consider the Web_3P1R.A example in the Web directory.

Example 4. Retrieving Web information by concurrent processes.

The listing of this example is almost identical to the listing of the previous program. Only the constructors of the worlds 'MyReceptor' in the definition of the 'Main' class are changed (they are now constructors of processes).

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving Web information by                --
--       concurrent processes.                        --
--------------------------------------------------------
class 'Main' specializing 'Dialog':
--
identifier      = "output";
--
actor1  = (('MyReceptor',
                location=
                        "http://www.cplire.ru/Lab144/"
                        "space/prolog.html",
                x=10,
                y=1));
actor2  = (('MyReceptor',
                location=
                        "http://www.comlab.ox.ac.uk/"
                        "archive/logic-prog.html",
                x=20,
                y=8));
actor3  = (('MyReceptor',
                location=
                        "http://www.cetus-links.org/"
                        "oo_prolog.html",
                x=30,
                y=15));
--
target_worlds   = [actor1,actor2,actor3];
results         = target_worlds ?? get_data();
--
[
goal:-
        show.
]
--------------------------------------------------------

The background color of the dialog box was also changed.

grid(80,25)
dialog_font("Arial",12,[])
dialog "output" (
          "",default,default,default,
          centered,centered,green)
     vbox(center)
          "Parameters of Web sites:"
          listbox['UseTabStops'](results,0000,45,4,[],[])
          button(close,"&Exit")
     end_of_vbox
end_of_dialog

The program outputs the same results:



Fig. 4.1. Retrieving Web information by concurrent processes.

Note that in all the examples considered so far, the list of Internet resources was known before the program started. The following examples consider methods of retrieving information from an arbitrary number of Web resources.

Example 5. Retrieving information from arbitrary number of Web resources.

Let us consider the Web_N.A example in the Web directory. In this example the goal actor uses the get_references predicate to receive a list of links to Actor Prolog articles published on our Web Site. It then passes this list as a parameter to the recursive procedure inspect_pages. The procedure processes the list of addresses, gets the parameters of each resource using the get_parameters predicate, and calculates the date of the most recent resource update. The calculated date is passed through the slot max_date to the process 'Output', which outputs it to the screen.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving information from arbitrary        --
--       number of Web resources.                     --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main' specializing 'Receptor':
--
location = "http://www.cplire.ru/Lab144/selected.html";
--
protecting: max_date;
--
dialog   = (('Output',
                location,
                max_date));
--
[
goal:-
        inspect_pages(?get_references,#).
--
inspect_pages([URL|Rest],Max1):-!,
        check_date(?get_parameters(URL),Max1,Max2),
        inspect_pages(Rest,Max2).
inspect_pages(_,max_date).
--
check_date(entry(_,Date2,_,_,_),Date1,Date2):-
        less(Date1,Date2),!.
check_date(_,MaxDate,MaxDate).
--
less(#,_):-!.
less(_,#):-!,
        fail.
less(Date1,Date2):-
        Date1 < Date2.
]
--------------------------------------------------------
class 'Output' specializing 'Dialog':
--
identifier      = "out";
recent_update;
--
location;
max_date;
--
txt             = ('Text');
--
[
goal:-!,
        show,
        check_date(max_date).
--
check_date(date(Year,Month,Day)):-!,
        recent_update==
                txt ? format(
                        "%04d-%02d-%2d",Year,Month,Day).
check_date(_).
]
--------------------------------------------------------
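The recursion of inspect_pages, check_date, and less in the listing above can be paraphrased in Python as follows. This is a sketch under assumptions: the # constant is modeled by None, dates are (year, month, day) tuples, an entry is a 5-tuple, any other value stands for an error message, and the table pages with its page names is invented for illustration.

```python
def less(date1, date2):
    """Mirrors less/2: an absent first date is less than anything."""
    if date1 is None:
        return True
    if date2 is None:
        return False
    return date1 < date2


def check_date(parameters, current_max):
    """Returns the later of current_max and the date stored in parameters."""
    if isinstance(parameters, tuple) and len(parameters) == 5:
        date = parameters[1]
        if less(current_max, date):
            return date
    # Error values and older dates leave the maximum unchanged.
    return current_max


def inspect_pages(urls, get_parameters):
    """Returns the most recent update date among the given resources."""
    max_date = None
    for url in urls:
        max_date = check_date(get_parameters(url), max_date)
    return max_date


# Invented sample data: entry(URL, Date, Time, Attributes, Size) as tuples.
pages = {
    "a.html": ("a.html", (2002, 3, 7), (12, 0, 0, 0), None, 100),
    "b.html": "lateness",
    "c.html": ("c.html", (2002, 5, 1), (9, 30, 0, 0), None, 200),
}
max_date = inspect_pages(["a.html", "b.html", "c.html"], pages.get)
print(max_date)
```

Tuples of the form (year, month, day) compare in chronological order, which is why the plain < comparison is sufficient in this sketch.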

The inter-process communication of the program can be described with the following diagram:



Fig. 5.2. Inter-process communication diagram.

Here is the dialog box definition:

grid(80,25)
dialog_font("Arial",15,[])
padding(1.4,1.4)
dialog "out" (
          "",default,default,default,
          centered,centered,default)
     vbox(center)
          "Recent update of selected papers "
          "about the Actor Prolog:"
          table
               row
                    column
                         "Date"
                    end_of_column
                    column
                         text['AlignCenter']
                              (recent_update,7,1,"")
                    end_of_column
               end_of_row
               row
                    column
                         "Web Page"
                    end_of_column
                    column
                         text['AlignLeft']
                              (location,22,1,"")
                    end_of_column
               end_of_row
          end_of_table
          button(close,"&Exit")
     end_of_vbox
end_of_dialog

The results of the program are the following:



Fig. 5.1. Arbitrary number of Web resources.

The disadvantage of this scheme is that all Internet resources are accessed from the same actor. Therefore a change of any resource will result in repeated proving of this actor and repeated access to all other resources. To avoid additional calculations it is possible to use another scheme, which will be considered in the next example.

Example 6. Using direct messages.

Let us consider the Web_ND1R.A example in the Web directory. The inter-process communication in this program follows this scheme:



Fig. 6.1. Using direct messages.

The process 'Main' gets a list of resources that have to be checked and sends direct messages to the process 'CheckPages'. Each direct message invokes the check of one resource. The parameters of the resource are put into a database. The resident owned by the process 'Output' retrieves a list of values from the database and sends it to the modeless dialog.
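This message flow can be imitated in Python with a queue and a dictionary. The sketch runs in a single thread and is not a model of Actor Prolog processes: the names mailbox, registry, send_references, handle_messages, and get_data, as well as the sample dates and page names, are assumptions made for illustration.

```python
from queue import Queue

mailbox = Queue()  # direct messages sent to the checking process
registry = {}      # plays the role of the database, keyed by URL


def send_references(urls):
    """Sends one check message per resource."""
    for url in urls:
        mailbox.put(("check", url))


def handle_messages(get_parameters):
    """Handles all pending check messages, one resource per message."""
    while not mailbox.empty():
        _, url = mailbox.get()
        # Assigning by key replaces any older record of the same URL,
        # like retractall followed by append in the listing.
        registry[url] = get_parameters(url)


def get_data():
    """Returns one 'URL<TAB>date' line per stored record."""
    return [
        "%s\t%s" % (url, "%d-%02d-%04d" % (day, month, year))
        for url, (year, month, day) in registry.items()
    ]


# Invented update dates for two hypothetical pages.
dates = {"a.html": (2002, 3, 7), "b.html": (2002, 5, 1)}
send_references(["a.html", "b.html", "a.html"])
handle_messages(dates.get)
lines = get_data()
print(lines)
```

The repeated message for a.html overwrites the earlier record instead of adding a second one.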

The result of the program is a list of Actor Prolog publications on our Web Site http://www.cplire.ru/Lab144 with the dates of their last update:



Fig. 6.2. Collected information.

Let us consider the program listing in detail.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving and output of information from    --
--       arbitrary number of Web resources.           --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main' specializing 'Receptor':
--
location = "http://www.cplire.ru/Lab144/selected.html";
--
receiver = (('CheckPages'));
--
out      = (('Output',
                data_source=receiver));
--
[
goal:-
        send_references(?get_references).
--
send_references([URL|Rest]):-!,
        receiver << check(URL),
        send_references(Rest).
send_references(_).
]
--------------------------------------------------------
class 'CheckPages' specializing 'Receptor':
--
location        = "";
--
registry        = ('Database');
--
[
goal.
--
check(URL):-
        store_parameters(URL,?get_parameters(URL)).
--
store_parameters(URL,Parameters):-
        registry ? retractall(page(URL,_)),
        registry ? append(page(URL,Parameters)).
--
get_data= Line
        :-
        registry ? find(page(_,entry(URL,Date,_,_,_))),
        Line== ?format(
                "%s\t%s",
                URL,
                ?date_to_string(Date)).
--
date_to_string(date(Year,Month,Day))=
        ?format("%d-%02d-%04d",Day,Month,Year).
]
--------------------------------------------------------
class 'Output' specializing 'Dialog':
--
identifier      = "out";
--
data_source;
--
results  = data_source ?? get_data();
--
[
goal:-
        show.
]
--------------------------------------------------------

Here is the dialog box definition:

grid(80,25)
dialog_font("Times New Roman",14)
dialog "out" (
          "",default,default,default,
          centered,centered,'Cyan')
     vbox(center)
          "Selected papers about Actor Prolog:"
          listbox['UseTabStops'](results,0000,28,4,[],[])
          button(close,"&Exit")
     end_of_vbox
end_of_dialog

In this example the information processing scheme is not logically pure because temporary results of computations are stored in a database (the 'Database' predefined class is used, whose operations have no model-theoretic semantics).

In addition, according to the Actor Prolog semantics, the special check actors created during direct message handling disappear right after the handling of the check message ends. Therefore Actor Prolog will not watch for possible changes of the Internet resources. The use of direct messages may itself sometimes break the logical semantics of a program as well.

Nevertheless, the considered scheme is useful in practice and is often used in programs that automate one-off search operations. Its advantages are simplicity and effectiveness. However, to create mathematically rigorous algorithms it is necessary to use other schemes based on concurrent processes, residents, and flow messages.

Example 6. Retrieving information from arbitrary number of Web resources by concurrent processes.

Let us consider the Web_NP.A example in the Web directory. This example differs from the previous ones in that a separate process is used to retrieve information from each resource. Because the number of information sources is arbitrary, processes are created dynamically as needed. Dynamic process creation is implemented with a new language feature that we have not considered yet, namely the so-called suspending process ports.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving information from arbitrary        --
--       number of Web resources by concurrent        --
--       processes.                                   --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main' specializing 'Receptor':
--
location = "http://www.cplire.ru/Lab144/selected.html";
--
target_list;
--
chain    = (('ReceptorChain',
                suspending: target_list));
--
[
goal:-
        check_list(?get_references).
--
check_list([]):-!.
check_list(target_list).
]
--------------------------------------------------------
class 'ReceptorChain':
--
target_list;
rest_of_list;
--
chain    = (('ReceptorChain',
                suspending: target_list=rest_of_list));
--
location;
--
receiver = (('CheckPage',
                location));
--
[
goal:-
        check_list(target_list).
--
check_list([location]):-!.
check_list([location|rest_of_list]):-!.
]
--------------------------------------------------------
class 'CheckPage' specializing 'Receptor':
--
con     = ('Console');
--
[
goal:-!,
        write_parameters(?get_parameters),
        con ? nl.
--
write_parameters(entry(URL,Date,Time,_,Size)):-
        con ? writeln("URL:  ",URL),
        write_date(Date),
        write_time(Time),!,
        con ? writeln("Size: ",Size).
write_parameters(Error):-
        con ? writeln(Error).
--
write_date(date(Year,Month,Day)):-
        con ? writeln("Date: ",Year,"-",Month,"-",Day).
--
write_time(time(Hours,Minutes,Seconds,_)):-
        con ? writeln(
                "Time: ",Hours,":",Minutes,":",Seconds).
]
--------------------------------------------------------

The process 'ReceptorChain' is created inside the process 'Main'. The distinctive feature of the 'ReceptorChain' class is that it is recursive, i.e. inside the world 'ReceptorChain' a new instance of the 'ReceptorChain' class is created, and so on. This lets the program create an unlimited number of processes, one for each Internet resource. Certainly, a program using a recursive class may get caught in an endless loop. However, thanks to the suspending port (slot) target_list this will not happen in our example.

The suspending keyword is used to declare a process port as a suspending one. A suspending port automatically switches the appropriate process to the "unused" state if the slot is an unbound variable or if its value is equal to the # constant, and automatically switches it to the execution state otherwise. By the "unused" state of a process we mean the absence of any computations in the process; the process neither handles nor sends messages. From the point of view of the declarative semantics of the program, an unused process is absent, as if there were no corresponding process constructor. A created process gets a list of resources, checks the first resource of the list, and assigns the tail of the list to the slot rest_of_list.
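The rule that governs a suspending port can be written as a single Python predicate. This is a sketch of the rule as stated above and nothing more: the names UNBOUND, HASH, and is_active are hypothetical, and the unbound variable and the # constant are modeled by two marker values.

```python
UNBOUND = object()  # stands for an unbound variable
HASH = "#"          # stands for the '#' constant


def is_active(port_value):
    """Tells whether a process with this suspending port value executes."""
    if port_value is UNBOUND:
        return False
    if isinstance(port_value, str) and port_value == HASH:
        return False
    # Any other value switches the process to the execution state.
    return True


print(is_active(UNBOUND))
print(is_active(HASH))
print(is_active(["http://www.cplire.ru/Lab144/selected.html"]))
```

A process for which is_active is false corresponds to the "unused" state: it performs no computations at all.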

The parameter target_list, containing the list of Internet resources that have to be checked, is passed to the constructor of the process 'ReceptorChain'. The suspending port then allows the creation of the appropriate process. The tail of the list is passed to the next constructor of the process 'ReceptorChain'. Thus a recursive chain of processes is created, each of which checks its resource and prints the resource parameters to a text window. The last of the created 'ReceptorChain' processes does not assign a value to its slot rest_of_list, and the creation of the chain of processes stops.
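The growth and termination of the chain can be traced with a small recursive Python function. This is an illustrative sketch, not a model of real process creation: the names UNBOUND, build_chain, and start_chain are hypothetical, and each process is represented only by its location and its state.

```python
UNBOUND = object()  # stands for a slot that has not received a value


def build_chain(port):
    """Returns (location, state) pairs, one per 'ReceptorChain' process."""
    if port is UNBOUND:
        # The suspending port keeps this process in the "unused" state,
        # so no further constructor is activated and the recursion stops.
        return [(None, "unused")]
    location, rest = port[0], port[1:]
    # A one-element list leaves rest_of_list without a value, as in
    # the clause check_list([location]) of the listing.
    next_port = rest if rest else UNBOUND
    return [(location, "active")] + build_chain(next_port)


def start_chain(references):
    """Mirrors 'Main': an empty list leaves target_list unbound."""
    return build_chain(references if references else UNBOUND)


chain = start_chain(["a.html", "b.html"])
print(chain)
```

For a list of two resources the sketch yields two active processes followed by one unused process at the end of the chain.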

To check the parameters of a resource, an auxiliary process 'CheckPage' is created within each process 'ReceptorChain' and receives the resource address. The created chain of processes can be depicted as follows:



Fig. 6.1. Recursive chain of processes.

The last process in the chain is drawn with a dotted line, which denotes the "unused" state. Notice that, unlike the previous program, this program remains mathematically rigorous in the dynamic environment of the Internet. In other words, if the number of resources changes, new processes will be created or previously created processes will be switched off. All changes on the Internet will invoke repeated proving of the corresponding actors and printing of the information to the screen.



Fig. 6.2. Information retrieved by the recursive processes.

The only disadvantage of the program considered is that a low-level tool (namely, a text window) that has no logical semantics is used to print the retrieved information. In the next example a resident and a modeless dialog are used to output the information.

Example 7. Mathematically rigorous retrieving and output of information.

Let us consider the Web_NP1R.A example in the Web directory. We start the discussion of the example with the scheme of information processing.



Fig. 7.1. Scheme of information processing.

The basic element of this example is the recursive definition of the chain of 'ReceptorChain' processes. However, unlike the previous program, this example outputs the retrieved information to a modeless dialog. For this purpose an additional parameter, previous_receptors, is added to the constructors of the 'ReceptorChain' processes; it accumulates the list of auxiliary 'CheckPage' processes that retrieve information from the Internet. The 'ReceptorChain' process located at the end of the chain passes the accumulated list of auxiliary processes to the process 'Output', which controls the dialog box.

--------------------------------------------------------
--       An example of Actor Prolog program.          --
--       (c) 2002, Alexei A. Morozov, IRE RAS.        --
--       Retrieving and output of information from    --
--       arbitrary number of Web resources by         --
--       concurrent processes.                        --
--------------------------------------------------------
project: (('Main'))
--------------------------------------------------------
class 'Main' specializing 'Receptor':
--
location = "http://www.cplire.ru/Lab144/selected.html";
--
target_list;
--
chain    = (('ReceptorChain',
                target_list,
                previous_receptors=[]));
--
[
goal:-
        check_list(?get_references).
--
check_list([]):-!.
check_list(target_list).
]
--------------------------------------------------------
class 'ReceptorChain':
--
suspending: target_list;
rest_of_list;
--
chain    = (('ReceptorChain',
                target_list=rest_of_list,
                previous_receptors=
                        [receiver|previous_receptors]));
--
previous_receptors;
--
location;
--
receiver = (('CheckPage',
                location));
--
all_receptors;
--
out      = (('Output',
                suspending: all_receptors));
--
[
goal:-
        check_list(target_list,all_receptors).
--
check_list([location],[receiver|previous_receptors]):-!.
check_list([location|rest_of_list],_).
]
--------------------------------------------------------
class 'CheckPage' specializing 'Receptor':
--
suspending: location;
--
parameters;
--
[
goal:-!,
        parameters == ?get_parameters.
--
get_data= Line
        :-
        parameters == entry(URL,Date,_,_,_),
        Line== ?format(
                "%s\t%s",
                URL,
                ?date_to_string(Date)).
--
date_to_string(date(Year,Month,Day))=
        ?format("%d-%02d-%04d",Day,Month,Year).
]
--------------------------------------------------------
class 'Output' specializing 'Dialog':
--
identifier              = "out";
background_color        = 'Yellow';
--
all_receptors;
--
results  = all_receptors ?? get_data();
--
[
goal:-
        show.
]
--------------------------------------------------------
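
The accumulation of previous_receptors in the listing above can be traced with a rough Python analogue. This is a sketch under stated assumptions: the function collect_receptors is hypothetical, the 'CheckPage' processes are represented by plain strings, and the target list is assumed to be non-empty (in the listing, the first check_list clause of 'Main' leaves target_list unbound for an empty list).

```python
# Hypothetical model of the accumulator; this is not Actor Prolog code.
def collect_receptors(target_list, previous_receptors=()):
    """Walk the chain and return the list handed to the 'Output' process."""
    location, rest_of_list = target_list[0], target_list[1:]
    receiver = f"CheckPage({location})"  # stands for the auxiliary process
    # Mirrors [receiver|previous_receptors] in the listing.
    all_receptors = [receiver, *previous_receptors]
    if not rest_of_list:
        # End of the chain: this list is what 'Output' receives.
        return all_receptors
    return collect_receptors(rest_of_list, all_receptors)


print(collect_receptors(["a", "b", "c"]))
```

Because each step prepends its receiver, the list built by this model is in the reverse order of target_list. Whether the dialog presents the entries in that order is not stated in the text.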

The resident that proves the get_data predicate in all processes of the given list is defined in the process 'Output'. The output of the program is the list of values results extracted from the 'CheckPage' processes. This list is displayed automatically in the dialog box, whose definition is given below:

grid(80,25)
dialog_font("Times New Roman",14)
dialog "out" (
          "",default,default,default,
          centered,centered,default)
     vbox(center)
          "Selected papers about Actor Prolog:"
          listbox['UseTabStops'](results,0000,25,4,[],[])
          button(close,"&Exit")
     end_of_vbox
end_of_dialog
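
The lines placed in the list box are built by get_data from the "%s\t%s" and "%d-%02d-%04d" patterns. The following Python sketch shows what such a line looks like, assuming that the format function of Actor Prolog follows the usual C printf conventions for these specifiers; the sample date is invented for illustration.

```python
# Hypothetical illustration of the line format; this is not Actor Prolog code.
def date_to_string(year, month, day):
    """Format a date the way the date_to_string clause appears to."""
    return "%d-%02d-%04d" % (day, month, year)


def get_data(url, date):
    """Build one tab-separated list box line from a URL and a date triple."""
    return "%s\t%s" % (url, date_to_string(*date))


line = get_data("http://www.cplire.ru", (2002, 3, 7))
print(repr(line))
```

The tab character between the two fields is presumably what the 'UseTabStops' option of the list box aligns into columns.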

Here are the results of the program:



Fig. 7.2. The results of the program.

Thus, we have created a program that retrieves information from an arbitrary number of Web resources and outputs it to the screen in a mathematically rigorous way. Information from different sources is processed independently thanks to the concurrent processes. Any change in the resources found, or even in their number (which can happen in the future), invokes automatic modification of the computation and output of the updated results to the screen.
