Anthony Mandic wrote:
> Can you describe in a little more detail, the
> exact nature of your ultimate problem or what
> what sort of processing is going on that lead
> you to this "solution"? Chances are that there
> may be a more elegant solution that bypasses
> this entirely.
Sure, but this may take a while.
This database (and our project) deals with student transfer
between the California community colleges, University of
California system, and California State University system.
There are about 160 public institutions of higher education
in California, and there are agreements in place (called
"articulation agreements") which specify how courses taken
at one institution can be used at another institution when
a student transfers. An articulation agreement can be a
lengthy and complex document -- some agreements between
two schools are 300 pages long.
Ultimately, the basic "unit" of articulation is called an
"instance." An instance can be very simple, such as
MATH 1 here can be used in lieu of MATH 101 there. An
instance can also be complex, with six courses on one side,
nine on the other, and half a dozen restrictions about how
they can be used.
To complicate matters further, we need to keep track of how
things change over time. For example, an instance contains
a foreign key to a course. If I display an articulation
agreement for the 99-00 academic year, I need to know that
the course was called "MATH 1" in that year, but if I
display the same agreement for the 97-98 academic year, I
need to know that the same course was called "MATH 11A" at
that time. The course attributes that can change over
time (independently) are its name, title, unit value, UC
transferability, CSU transferability, lab component and
cross-listing in multiple departments. In addition to
courses changing over time, institution information (name
and term type), departments, course prefixes, majors and
articulation instances all change over time while
maintaining their identity.
An articulation agreement can be viewed by department or
by major. In either case, it's the same instances, but
the organization is different.
If you want an example of what one of these looks like,
here is a small one -- between American River Colleage
and UC Santa Cruz for 97-98 for the Business Management
Economics major:
http://www.assist.org/cgi-bin/7g31.pl?event=19&pgid=Page4_aa&kind=3&o...
(sorry if that line wrapped)
The general site, from which any agreement can be
displayed is: http://www.assist.org
Okay, the tables involved, with approximate row counts,
are as follows. These aren't all of the tables in the
database, just the primary ones used to produce the
information for an articulation agreement:
a_art_aa: One row for each articulation agreement (10,000
rows)
a_art_instance: One row for each instance of articulation
(3 million rows)
a_major: one row for each major at each institution (3,000
rows)
a_major_hist_namechange: historical name change information
for majors (400 rows)
a_art_aa_instance_major: Represents ternary relationship
between agreements, instances and majors (6 million rows)
a_art_aa_instance_signoff: represents approval dates for
instances as they apply to agreements (4.5 million rows)
a_art_instance_r_course: represents the many-to-many
relationship between instances and courses on the receiving
side (2.5 million rows)
a_art_instance_s_course: represents the many-to-many
relationship between instances and courses on the sending
side (1.5 million rows)
a_art_instance_r_text: One instance can have many pieces
of text that can appear in addition to or instead of
receiving courses (eg: "No Course Articulated" in the
sample I gave) (750,000 rows)
a_art_instance_s_course: same as the above for the
sending side (1.5 million rows)
a_articulation_footnote: one row for each footnote
associated with an instance (200,000 rows)
a_art_instance_footnote: Represents many-to-many
relationship between a_art_instance and
a_articulation_footnote (500,000 rows)
a_art_instance_comment: one row for each comment
associated with an instance (700,000 rows)
a_course: One row for each course (500,000 rows)
a_course_hist_name: tracks course name changes over time
(20,000 rows)
a_course_hist_title: tracks course title changes over
time (15,000 rows)
a_course_hist_units: tracks course unit changes over
time (5,000 rows)
a_course_hist_uc: tracks UC transferability changes
over time (3,000 rows)
a_course_hist_csu: tracks CSU transferability changes
over time (4,000 rows)
a_course_hist_lab: tracks lab component changes
over time (500 rows)
a_course_sameas: tracks cross-listing of courses
(20,000 rows)
a_department: one row for each department at each
institution (11,000 rows)
a_dept_hist_namechange: tracks historical name
changes for departments (500 rows)
a_course_prefix: one row for each course prefix
(eg: MATH, BIOL, HIST) at each institution (15,000
rows)
a_dept_course_prefix: tracks many-to-many
relationship between departments and course
prefixes (14,000 rows)
a_institution: one row for each institution (160
rows)
a_inst_hist_namechange: tracks historical name
changes for institutions (10 rows)
a_ay_date: one row for each academic year (50
rows)
a_term_date: one row for each term (200 rows)
Okay, if you're still with me, to display an
articulation agreement by major for one particular
major (like the example you saw), you need to know:
The sending instituion
The receiving institution
The academic year
The major
To find out the instances involved in an agreement
by major, knowing the above, you need to join:
a_art_aa
a_institution (twice)
a_art_instance
a_art_aa_instance_major
a_art_aa_instance_signoff
a_major
Then for each instance, you need to:
1. Get the info for all the sending courses
involved in that instance. The info must be
current of the academic year specified.
2. Same for receiving courses
3. Same for sending & reciving text
4. Same for comments and footnotes.
So for each instance, you need to do a join of:
a_art_instance
a_art_instance_s_course
a_art_instance_r_course
a_course (twice)
a_art_instance_s_text
a_art_instance_r_text
a_art_instance_comment
a_art_instance_footnote
a_articulation footnote
And for each course in each instance, in order to
get the right set of attributes, you need to join:
a_course
a_course_hist_name
a_course_hist_title
a_course_hist_units
a_course_hist_uc
a_course_hist_csu
a_course_hist_lab
a_course_sameas
a_course_prefix
a_dept_course_prefix
a_department
a_dept_hist_namechange
Once you have a correctly formatted instance with
all the time-correct information, you put all the
instances together in order, add extra things at
the agreement level, and you're done.
We have perl code (using sybperl) to put together
an agreement. Given the schools, year and major,
it gets the agreement-level info, figures out the
orderd list of instances to appear in the
agreement, then for each instance, formats the
individual instance. The instance formatting
code does a database call to get the course info
for all the courses for that instance, formatts
them, and returns the formatted info back to the
agreement-formatting code, which, once it's
iterated over the entire list of instances, puts
it all together and returns it for display.
So just counting instances and courses, and
ignoring the rest, let's say there are 20
instances in a specific agreement for a specific
major. The database calls then are:
One call for the agreement-level info.
One call to get the instances.
For each instance (there are 20 of them)
One call to get the sending course info
One call to get the receiving course info
So that's a total of 42 database queries. We
want to change it to the following:
One call for the agreement-level info.
One call to get the instances.
One call to get ALL the sending course info
One call to get ALL the receiving course info
Then iterate over the three result sets, rather
than what we currently do, which is to iterate
over the one instance result set and do individual
calls for the course info for that instance. This
would involve a total of four database queries,
which is a lot better than 42.
Okay, the procedure to get the list of instances
is non-trivial, and takes time to run. So we
only want to run it once. Once we have the
ordered list of instances, we want to pass it to
the "get ALL sending course info" and the "get
ALL receiving course info" procedures. But this
means passing these routines a SET of instance
ID's rather than just one at a time. That's why
I was asking for an efficient way of passing a
result set to a stored procedure.
Sorry about being so long-winded, but you asked.
:-)
I hope that made sense.
Larry Coon
University of California
la...@assist.org
and lmc...@home.com