Design question on enforcement of rules on objects created after parsing a binary file.


Post by Vika » Thu, 17 Jul 2003 22:57:35



I am working on an application to be written in C++ which has to read
a binary file and parse it to get the data out of it. The file format
is Integrated Product Message (IPM) which contains message type,
elements and sub elements in it.

The elements in the file and their values depend on the message type,
and one element's value may depend on whether some other element
is present and what its value is.

For example, I will have to enforce rules of the following kind:

A sub element 0005 should be present in element 48 with value 10, 15
or 20 for a message type 1600, if the element 73 value is 700.

I am trying to figure out how to enforce these kinds of rules. I want
to keep the rules independent of the elements and keep the enforcement
of the rules flexible as in future new rules may be added or removed
for the message type and its elements. I would appreciate your input
on designing the application.

Thank you for your replies.

Vikas

 
 
 


Post by Bruno Desthuillier » Fri, 18 Jul 2003 02:10:43



> I am working on an application to be written in C++ which has to read
> a binary file and parse it to get the data out of it. The file format
> is Integrated Product Message (IPM) which contains message type,
> elements and sub elements in it.

> The elements in the file and their values depend on the message type,
> and one element's value may depend on whether some other element
> is present and what its value is.

> For example, I will have to enforce rules of the following kind:

> A sub element 0005 should be present in element 48 with value 10, 15
> or 20 for a message type 1600, if the element 73 value is 700.

Almost a minilanguage in itself...

> I am trying to figure out how to enforce these kinds of rules. I want
> to keep the rules independent of the elements and keep the enforcement
> of the rules flexible as in future new rules may be added or removed
> for the message type and its elements.

<my-2-cents-just-thinking-out-loud>
I don't know for sure, but I'd think of a configurable state-machine,
with a rule-language to configure it. You then feed the state machine
with the tree you got from parsing the IPM file.

This way you have a clean separation of the mechanism (the
state-machine) and the policies (the rules), and can change the rules
without changing any code.
</my-2-cents-just-thinking-out-loud>

Bruno

 
 
 


Post by JXSter » Fri, 18 Jul 2003 02:28:29


On Wed, 16 Jul 2003 17:10:43 +0000, Bruno Desthuilliers


>> A sub element 0005 should be present in element 48 with value 10, 15
>> or 20 for a message type 1600, if the element 73 value is 700.

>Almost a minilanguage in itself...

>> I am trying to figure out how to enforce these kinds of rules. I want
>> to keep the rules independent of the elements and keep the enforcement
>> of the rules flexible as in future new rules may be added or removed
>> for the message type and its elements.

><my-2-cents-just-thinking-out-loud>
>I don't know for sure, but I'd think of a configurable state-machine,
>with a rule-language to configure it. You then feed the state machine
>with the tree you got from parsing the IPM file.

>This way you have a clean separation of the mechanism (the
>state-machine) and the policies (the rules), and can change the rules
>without changing any code.
></my-2-cents-just-thinking-out-loud>

And if this is any kind of an industry-standard message, some bright
boy has probably already built a message parser/builder object.  

Let us google ...

http://www.internetnews.com/ec-news/article.php/10793_1561181

"Purchase, N.Y.-based MasterCard's global platform was built on the
International Standards Organization's (ISO) 8583 message set. About
97 percent of all card-issuing banks are currently using the new
Integrated Product Message (IPM) format platform, which allows banks
to process clearing transactions up to six times a day vs. a single
batch method once a day."

well, there's this, but I guess I'll leave the detailed searching to
you.

http://www.alaric-systems.co.uk/mapper.html

J.

 
 
 


Post by Vika » Fri, 18 Jul 2003 07:16:33



> On Wed, 16 Jul 2003 17:10:43 +0000, Bruno Desthuilliers

> >> A sub element 0005 should be present in element 48 with value 10, 15
> >> or 20 for a message type 1600, if the element 73 value is 700.

> >Almost a minilanguage in itself...

> >> I am trying to figure out how to enforce these kinds of rules. I want
> >> to keep the rules independent of the elements and keep the enforcement
> >> of the rules flexible as in future new rules may be added or removed
> >> for the message type and its elements.

> ><my-2-cents-just-thinking-out-loud>
> >I don't know for sure, but I'd think of a configurable state-machine,
> >with a rule-language to configure it. You then feed the state machine
> >with the tree you got from parsing the IPM file.

> >This way you have a clean separation of the mechanism (the
> >state-machine) and the policies (the rules), and can change the rules
> >without changing any code.
> ></my-2-cents-just-thinking-out-loud>

> And if this is any kind of an industry-standard message, some bright
> boy has probably already built a message parser/builder object.  

> Let us google ...

> http://www.internetnews.com/ec-news/article.php/10793_1561181

> "Purchase, N.Y.-based MasterCard's global platform was built on the
> International Standards Organization's (ISO) 8583 message set. About
> 97 percent of all card-issuing banks are currently using the new
> Integrated Product Message (IPM) format platform, which allows banks
> to process clearing transactions up to six times a day vs. a single
> batch method once a day."

> well, there's this, but I guess I'll leave the detailed searching to
> you.

> http://www.alaric-systems.co.uk/mapper.html

> J.

Thanks for your replies. I found a Java library called jPOS for ISO
8583 but didn't find anything for C++. I guess I have to do it the hard
way, i.e., code it myself.

Vikas

 
 
 


Post by H. S. Lahma » Sun, 20 Jul 2003 01:53:28


Responding to Vikas...

> Thanks for your descriptive post. I was thinking in the same direction
> but your post gave me a concrete idea. But I am still not sure about
> the object model for the specification objects, how they will be
> created and their association with element objects. I am providing
> more information on the IPM format and the rules that I need to apply
> that will give a better idea. I will appreciate it if you can provide
> further comments.

Before getting into details, the general pattern...  The basic idea is
for the context object to implement a quite generic behavior that is
modified parametrically by the attributes of the specification object.
IOW, one abstracts some invariant commonality in the context object and
leaves the detailed variations as data.  So the specification class is
usually a dumb data holder.  The rules are implicitly enforced by
instantiating the association between the context object and the correct
specification object.
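
A minimal C++ sketch of this pattern, under assumed names (ValueSpec and ElementChecker are hypothetical, not taken from any IPM library). The checker implements one generic behavior, and the specification is a plain data holder that parameterizes it:

```cpp
#include <set>
#include <string>

// Hypothetical "dumb data holder": which values an element may take.
struct ValueSpec {
    int elementNumber;              // e.g. a DE number
    std::set<std::string> allowed;  // an empty set means "any value"
};

// Context object: one generic behavior, varied only by the specification
// it is associated with.
class ElementChecker {
public:
    explicit ElementChecker(const ValueSpec& spec) : spec_(&spec) {}

    bool accepts(const std::string& value) const {
        return spec_->allowed.empty() || spec_->allowed.count(value) > 0;
    }

private:
    const ValueSpec* spec_;  // the association; the spec must outlive the checker
};
```

Changing a rule then means associating the checker with a different ValueSpec (for example one holding "10", "15" and "20" for element 48) rather than changing the checker's code.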

Note that XML is an example of this sort of thinking.  XML abstracts the
invariant fundamentals of the relational data model.  That allows an
object that knows how to process an XML string to deal with any data
structure.  Thus XML is useful for Factories when instantiating object
instances based upon external configuration data.

In your situation the basic idea is to think about how one might express
the IPM and other relevant rules in a small suite of common
specifications.  As you indicate, the rules for the IPM organization to
extract elements seem to be pretty basic.  The tricky part will lie in
specifying the validation of element values.

> The IPM message type gives what kind of message is in the file. It's a
> 4-byte numeric field at the start of the file. Then there are two 8-byte
> fields called bitmaps that give information on which elements,
> named Data Elements (DEs), are present. The DEs are in ascending order
> and they can contain subfields or collections of subfields, i.e., the same
> subfield multiple times. The subfields themselves are in ascending
> order. A few DEs whose element numbers are known beforehand can have
> special subfields called Private Data Subfields (PDS), which
> themselves can have subfields or collections of them. DEs and
> subfields are recognized by their position, which can change based on
> whether other DEs are present or not, while the PDSs are recognized by
> their tags.

I think I've got everything up to the last sentence, but just to be sure
I understand...  There's a bit in the bitmaps for each possible DE, so
DEs are numbered 0-127 (or 1-128).  If the bit is set, that DE is
present.  A similar mechanism is used to define the subfields in DEs
(i.e., the leading two bytes of the DE will have a bitmap to indicate
the subfields present).

For a suite of DEs whose numbers are predefined (PDSes) the subfields
themselves can have subfields.  Presumably a similar mechanism is used
for the subfields to identify their subfields.

I still need a couple of clarifications:

(1) I assume that subfields of PDS subfields cannot have their own
subfields(?)

(2) I assume that by 'change' in the last sentence you mean that the
value of the data is modified, not the position(?)

(3) I assume from the stuff below that you need to validate that the
buffer has not been corrupted or you have application rules that are
expressed in terms of element dependencies in the buffer.  If not, could
you put some more words around the last sentence?

> An IPM file is in the following format. I have put spaces between elements
> for clarity.

> XXXX xxxxxxxx xxxxxxxx XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> Msg  Bitmap1  Bitmap2  Rest of the file with DEs containing subfields or PDS

> If a DE or PDS is variable length then there is a field containing the
> length of that element. Some properties of the elements are known at
> compile time and some only at run time.

> Static properties - Type (alpha, numeric, binary etc.), Fixed or
> variable length, composite (for DE and PDS) etc.
> Dynamic properties - element present or not, set of values it can have
> etc.

> I found that I generally have to apply the following three dynamic rules.

> 1. Check for presence or absence of elements:
> - if element(s) are present or absent or element(s) have certain
> values then other element(s) should be present or absent
> - element(s) are present or absent if other element(s) are present or
> absent or other element(s) have certain values

> I don't know if the above two cases can be combined into one.

I would think about this somewhat differently.  I see two tasks: parsing
the buffer to determine what is there and validating that what is there
is correct.  (Clearly, if it is possible that a bitmap bit got dropped,
you have to be able to detect that you are parsing gibberish as soon as
possible, which would require parsing and validation to be essentially
concurrent.)

> 2. Validation of element values - this can be against another element
> value or range of values or some wildcard. The comparison can be =, >,
> < or !=.

> 3. Validation of combination of values - validate that two or more
> element values have certain combination.

Just to clarify...  So far this sounds like you must (a) create the
buffer, (b) validate that it hasn't been corrupted, and/or (c) create
application artifacts (attribute values, objects, relationships, etc.)
or do special processing based upon dependencies among elements in
the buffer.

I assume we can rule out (a).  Among the remaining it is important to
know which it is because (b) implies concurrency with parsing while (c)
does not and that would be reflected in the object model.


> I am thinking that the rules will have to be applied in a particular
> order. For example, presence or absence needs to be tested first,
> followed by validation of element values and then the combination rule.

> Parsing an IPM file is not a problem. I know exactly which DEs are
> present based on the bitmaps, PDS based on their tags and the
> subfields based on the DE or the PDS property. I can also easily do
> validation of static properties for an element but it is the dynamic
> properties for which I am not sure how to enforce the rules.
> Furthermore, the application has to flexibly add or delete a rule for
> a particular element. The validation is also complicated by the fact
> that new elements may be added in future and the application should be
> able to handle them. For example, currently the application is not
> reading DE 41. But in the future it may have to, and that DE will have
> certain rules. How can I write the application so that no or minimal
> coding changes are required?

I agree that parsing the IPM file seems straightforward.  One can
probably do that without specification objects by just distributing the
responsibilities among different classes.  For example, one might have a
Bitmap class that handles creating a list of element numbers.  The
instance of bitmap associated with MessageCode or DE would do the same
processing for the bitmap values in either context.

[If the size of the bitmaps is different for MessageCode and DE, then
that would be a candidate for external specification.  But in this case
it would probably be easier to just set it in the constructor since
whoever creates the instance would know the context when it instantiates
the relationship.]
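
As a rough illustration of such a Bitmap class, here is a sketch in C++. It assumes the ISO 8583 convention that bit 1 is the most significant bit of the first byte and that elements are numbered from 1; the class name, the constructor arguments and the firstNumber parameter are all hypothetical, and the real IPM layout should be checked before relying on any of this.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical Bitmap class: turns raw bitmap bytes into element numbers.
// Assumes bit 1 is the most significant bit of the first byte and that
// numbering starts at firstNumber (1 by default).
class Bitmap {
public:
    // The size and the first number are set by whoever creates the
    // instance, since the creator knows the context (e.g. 1 for the first
    // bitmap, 65 for the second).
    Bitmap(const std::uint8_t* bytes, std::size_t size, int firstNumber = 1)
        : bytes_(bytes, bytes + size), firstNumber_(firstNumber) {}

    // Element numbers whose bits are set, in ascending order.
    std::vector<int> elementNumbers() const {
        std::vector<int> present;
        for (std::size_t i = 0; i < bytes_.size(); ++i) {
            for (int bit = 0; bit < 8; ++bit) {
                if (bytes_[i] & (0x80u >> bit)) {
                    present.push_back(firstNumber_ + static_cast<int>(i) * 8 + bit);
                }
            }
        }
        return present;
    }

private:
    std::vector<std::uint8_t> bytes_;  // private copy of the raw bitmap
    int firstNumber_;
};
```

With the eight bytes C0 00 00 00 00 01 00 01 and the default first number, elementNumbers() yields 1, 2, 48 and 64. The same class serves any other bitmap context just by passing a different size or first number.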

Alas, I think I need more words around the validation rules before
getting too specific about details.  But it may actually be helpful from
the viewpoint of identifying invariants.  B-) The first thing to note is
that what you do is different than what you test.

For example, testing values may determine which elements to expect or it
may just result in ignoring an element like DE 41.  IOW, in the context
of IF <condition> THEN <body> ELSE <body> one wants to specify
<condition> separately from <body>.  So let's look at what <condition>
is about.

Basically <condition> will be a logical expression involving {element
number, value} pairs.  For instance,

({41} > {12}) OR ({82} = 0)

In this text specification the numbers are the element identifiers
(ignoring compound identity for {DE, subfield}).  Now the same parser
can process any arbitrarily complex text expression by accessing the
correct element values and "walking" the expression.  [Whether this
parser wants to be in a special class or not depends upon how complex
the expressions can be.  If there won't be any AND/OR possibilities one
probably doesn't need a dedicated parser class.]
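
To make "walking" the expression concrete, here is a small recursive-descent sketch in C++. Everything in it is an assumption for illustration: the class and function names, the restriction to integer values, the operators =, !=, < and >, and the choice to throw when a referenced element is absent. It ignores compound {DE, subfield} identity, as the example above does.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical evaluator for conditions like "({41} > {12}) OR ({82} = 0)".
// {n} reads the value of element n; a bare number is a literal.
class ConditionEvaluator {
public:
    ConditionEvaluator(const std::string& text, const std::map<int, long>& values)
        : text_(text), values_(values), pos_(0) {}

    bool evaluate() {
        pos_ = 0;
        bool result = parseOr();
        skipSpaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing text");
        return result;
    }

private:
    // Both operands are always parsed (no short-circuit), so the position
    // in the text stays correct.
    bool parseOr() {
        bool result = parseAnd();
        while (accept("OR")) { bool rhs = parseAnd(); result = result || rhs; }
        return result;
    }
    bool parseAnd() {
        bool result = parseFactor();
        while (accept("AND")) { bool rhs = parseFactor(); result = result && rhs; }
        return result;
    }
    bool parseFactor() {
        if (accept("(")) {
            bool result = parseOr();
            if (!accept(")")) throw std::runtime_error("expected ')'");
            return result;
        }
        long lhs = parseOperand();
        if (accept("!=")) return lhs != parseOperand();
        if (accept("="))  return lhs == parseOperand();
        if (accept("<"))  return lhs <  parseOperand();
        if (accept(">"))  return lhs >  parseOperand();
        throw std::runtime_error("expected comparison operator");
    }
    long parseOperand() {
        if (accept("{")) {
            long id = parseNumber();
            if (!accept("}")) throw std::runtime_error("expected '}'");
            return values_.at(static_cast<int>(id));  // throws if the element is absent
        }
        return parseNumber();
    }
    long parseNumber() {
        skipSpaces();
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        if (start == pos_) throw std::runtime_error("expected number");
        return std::stol(text_.substr(start, pos_ - start));
    }
    // Consumes the token if it comes next (after optional spaces).
    bool accept(const std::string& token) {
        skipSpaces();
        if (text_.compare(pos_, token.size(), token) != 0) return false;
        pos_ += token.size();
        return true;
    }
    void skipSpaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    std::string text_;
    const std::map<int, long>& values_;
    std::size_t pos_;
};

// Convenience wrapper: evaluate a condition string against element values.
bool evaluateCondition(const std::string& text, const std::map<int, long>& values) {
    return ConditionEvaluator(text, values).evaluate();
}
```

The condition string stays pure data, so a new rule is a new string rather than new code. If no AND/OR is ever needed, parseFactor alone would do and the dedicated class is probably overkill.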

Now the specification object only needs three attributes.  One for the
condition string and one each for the THEN/ELSE body strings.  (Assuming
one can abstract the actions to be taken in a similar manner.)

A mechanism for specifying the body strings requires more detailed
knowledge of the rules.  But so far you have described three types of
activity:

(1) determine what elements should be present or not present.

(2) determine what some element value should be.

(3) determine some application processing.

(1) and (2) should be fully expressible in terms of element identifiers
and values and the most complicated parsing would be a list.  (3)
becomes trickier.  In the case of DE 41 it might be as simple as setting
some flag attribute.  But, in any case, writing some application code is
unavoidable.
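
For activity (1), the body can be sketched as plain lists of element numbers plus one generic check. The names PresenceBody and checkPresence are hypothetical, and the set of present elements is assumed to come from the bitmap parsing:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical body for activity (1): plain lists of element numbers.
struct PresenceBody {
    std::vector<int> mustBePresent;
    std::vector<int> mustBeAbsent;
};

// Returns one message per violation; an empty result means the body holds.
std::vector<std::string> checkPresence(const PresenceBody& body,
                                       const std::set<int>& present) {
    std::vector<std::string> violations;
    for (int id : body.mustBePresent) {
        if (present.count(id) == 0) {
            violations.push_back("element " + std::to_string(id) + " is missing");
        }
    }
    for (int id : body.mustBeAbsent) {
        if (present.count(id) != 0) {
            violations.push_back("element " + std::to_string(id) + " must be absent");
        }
    }
    return violations;
}
```

Activity (2) would follow the same shape, with a list of allowed values per element instead of a list of element numbers.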

The trick for (3) is to capture the processing in a method and
parameterize as much as possible in input arguments.  If the processing
is complex and there are several contexts, one might want to look at
design patterns like State and Strategy.  The key is that there is some
simple identity scheme where the identifier in the body attribute can be
associated with a particular activity in the code.
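
One way to sketch that identity scheme in C++ is a registry that maps the identifier stored in a body attribute to a callable. RuleSpec, ActionRegistry and the identifier strings are all hypothetical, and the condition is assumed to have been evaluated elsewhere:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical specification object: three plain strings.
struct RuleSpec {
    std::string condition;  // e.g. "({41} > {12}) OR ({82} = 0)"
    std::string thenBody;   // identifier of the activity to run when true
    std::string elseBody;   // identifier to run when false ("" means none)
};

// Associates body identifiers with application code.
class ActionRegistry {
public:
    using Action = std::function<void()>;

    void add(const std::string& id, Action action) {
        actions_[id] = std::move(action);
    }

    // Runs the body selected by the already evaluated condition.
    void run(const RuleSpec& spec, bool conditionHolds) const {
        const std::string& id = conditionHolds ? spec.thenBody : spec.elseBody;
        if (id.empty()) return;  // nothing to do for this branch
        auto it = actions_.find(id);
        if (it == actions_.end()) throw std::runtime_error("unknown action: " + id);
        it->second();
    }

private:
    std::map<std::string, Action> actions_;
};
```

The application code still has to be written, but it is registered once under a name, and the rules refer to it only by that name.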

[I've talked about attribute strings because they can be quite general
for things like logical expressions.  However, there is nothing to
...


 
 
 
