To Banq: about XML transformation and Web portal management

Posted by Jevang on 2003-04-04
Hi Banq,
Haven’t talked with you for a while. How is it going?
Recently I have been busy with a project design and have some questions. I think your input would be helpful, and of course comments from everyone on this board are highly appreciated.

The first phase of the project is a data warehouse. I care most about the ETL part, i.e. how to get data in various formats from heterogeneous systems into a single centralized common model. The number of systems to be integrated is unpredictable in the short term.
To keep the development effort manageable, I want to divide the implementation into two major parts:
1> The DW center, with the common model
2> An agent, or adaptor, that will be loaded on external systems

DW center part:
Unless direct DB-to-DB transformation is available and efficiency is critical, I expect the majority of data collection to be done through XML, so I will only discuss XML handling here.

The DW will run on top of an app server and Topas. Upon receiving incoming XML, synchronously or asynchronously, Topas will convert the XML data into Topas business objects. This triggers a set of business rules declared in the Topas meta model or in users’ Java code; validation and data cleansing are done at this stage. Further down the road, valid input Java objects will be persisted to the DB.

The first step is to define the common data model in XML files. They are the meta model, the cornerstone of the whole development; the database schema and the Topas business objects will be generated from those XML files.

Here the input XML from external systems is already in the standard format; its specification (XSD or DTD) is basically derived from the meta model XML (I will discuss a more complicated case later). So almost everything here is automated once the meta model exists: if no customized rules apply (for sure there will be some, maybe a lot), no code is required, as Topas’s built-in XML unmarshalling and data persistence services will take care of everything.

The only uncertainty here is which wire protocol I should prepare to use. Ideally it would be JMS, as it is the most reliable and it is async; but I want to leave the door open for external systems that do not have a dedicated TCP/IP connection to the center, or that may not even be Java systems.

So a JMS queue with messages in XML format is definitely an option. The alternative is a SOAP service, which can take a SOAP message sent through HTTP (from a browser or other client) or SMTP, unpack it, extract the XML data inside, and pass it to Topas for processing.


My biggest concern is the adaptor for external systems. Assume I have provided such an adaptor to run inside other parties’ systems; for sure it needs to be highly flexible and customizable. For now I am just trying to estimate the amount of work as a thought exercise.

I have already discussed the plumbing, so now I will just focus on how to convert a variety of schema structures into XML data that conforms to the common spec.

I can take a similar approach as in the DW center: start by modeling the database in XML, or simply reverse-engineer the XML from the database schema. Once this is done, I can use Topas to automatically pull the data out of the database and marshal it into XML according to the previously defined XML meta model. The task left is to convert the format from the source XML to the common XML format.
The answer is XSLT. I need to define an XSLT stylesheet, which will be fed to an XSLT processor (e.g. Xalan) together with the source XML. The output will be an XML message in the common format, ready to be sent out through JMS; if it goes through SOAP, another layer of wrapping has to be performed.
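To make the XSLT step concrete, here is a minimal sketch using the standard JAXP transformation API (which Xalan also implements). The element names CUST_REC, CUST_NO, CUST_NM and the target customer format are made up for illustration; they are not from any real schema or from Topas.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SourceToCommonXml {

    // Hypothetical stylesheet: maps one source-specific <CUST_REC> record
    // to a <customer> element in the (assumed) common format.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/CUST_REC'>"
      + "<customer id='{CUST_NO}'><name><xsl:value-of select='CUST_NM'/></name></customer>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Feeds the stylesheet and the source XML to the XSLT processor
    // and returns the transformed message as a string.
    static String toCommonFormat(String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String source = "<CUST_REC><CUST_NO>42</CUST_NO><CUST_NM>Acme</CUST_NM></CUST_REC>";
        System.out.println(toCommonFormat(source));
    }
}
```

In this arrangement the per-system customization would live in the stylesheet, while the Java code stays the same from system to system.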

Topas’s scheduling service can be configured to control the frequency of data upload.


In summary, for each external system I need to repeat the same exercise once, hopefully with just some minor tweaks. Customization is hopefully done in the XML spec, with minimal code changes when porting from system to system.

The problem with this solution is that it might be overkill in some circumstances. A small system may not need to run an app server and Topas; a hard-coded translation from DB records to the common XML message may be even more straightforward. Or the source system may have no RDBMS at all, with the data sitting somewhere else: Excel, XML files or others.
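As a rough idea of what the hard-coded route could look like, the sketch below writes one record straight to XML with the JDK's StAX writer, with no app server involved. The Map stands in for a row read via JDBC, Excel or any other source; the element names are hypothetical, and StAX itself is a present-day JDK API rather than something assumed to be part of Topas.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class RecordToXml {

    // Writes one record as <element><column>value</column>...</element>.
    // The row map is a stand-in for whatever the source system provides.
    static String toXml(String element, Map<String, String> row) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement(element);
            for (Map.Entry<String, String> column : row.entrySet()) {
                writer.writeStartElement(column.getKey());
                writer.writeCharacters(column.getValue()); // escapes &, < and >
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.close();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not write XML", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("name", "A & B");
        System.out.println(toXml("customer", row));
    }
}
```

The trade-off is the one described above: this is quick for a small system, but every mapping change means a code change instead of a spec change.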

My first question to you: do you know of any other lightweight DB2XML or TheRestOfWorld2XML framework that can be applied here? I know you have experience with Castor; can it be used here?

Topas XML marshalling/unmarshalling is another area to be enhanced. Think about it: each spec for the common XML may not contain just one object, or say one table; it may contain multiple records and record types that together represent one complete unit of information. For example, a customer order will contain customer, address, order, line item and maybe product detail. Such a complicated XML spec has to be predefined; it cannot be generated from the meta model, which is only applicable to a single table. Right now Topas maintains a home-grown structure to represent this nesting and uses SAX/DOM parsing. I am afraid I will have to constantly update this component if its syntax is not rich enough to accommodate more information.

Second question: have you seen a good XML spec that people use to represent a complicated object model? It should also contain notions such as whether the message is about inserting, updating or marking objects invalid (as delete may not be allowed here).
A long time back I studied BizTalk, and there are some standards used in certain industries, but I found them too overwhelming to understand. I want something simple.
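To illustrate the kind of spec I have in mind, here is a small sketch in which each nested element carries an operation marker. The attribute name op, its values (insert, update, invalidate) and the element names are purely my own invention, not an existing standard; the code just walks the document with the standard DOM API and lists the requested actions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChangeMessageReader {

    // Parses the message and returns one "op element id" entry per marked element,
    // in document order (parent before children).
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> actions = new ArrayList<>();
            walk(doc.getDocumentElement(), actions);
            return actions;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse message", e);
        }
    }

    // Depth-first walk; elements without the hypothetical "op" attribute are skipped.
    private static void walk(Element element, List<String> actions) {
        String op = element.getAttribute("op");
        if (!op.isEmpty()) {
            actions.add(op + " " + element.getTagName() + " " + element.getAttribute("id"));
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                walk((Element) child, actions);
            }
        }
    }

    public static void main(String[] args) {
        String message =
            "<order id='1001' op='insert'>"
          + "<customer id='42' op='update'><address id='7' op='invalidate'/></customer>"
          + "<lineItem id='1' op='insert'/>"
          + "</order>";
        describe(message).forEach(System.out::println);
    }
}
```

A per-element marker like this keeps one message as one complete information unit, while still saying what should happen to each record inside it.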

I am also reluctant to choose a pure RPC model, i.e. to categorize the activities and define a method for each of them, as in a typical web service: AddCustomer, UpdateShipInfo, etc.
Maybe I should check out XQuery when I have time.


The second phase, in my view, is an ASP model, that is, building and hosting applications for my clients. The majority of functions in the client apps are similar, so some clients may share them at the runtime app level or at the code level. I am very interested to know about your automated Web portal system. But because my target users are different, I am less focused on front-end customization and need greater logic control capability than what you have so far. Actually, some folks here are making good points about a workflow engine; I think one may be required to achieve my objective. Do you have any plan to integrate one? OFBiz or OSWorkflow sound quite popular.

Another question: can I easily replace your default Bean with a customized one to do more processing?

Does your portal provide statistical reports?

I played with the estore site you pointed me to. I like it.
Just some personal opinions: the online retail market has become saturated, as the entry barrier is low both technically and on the business side. You have a great idea and possibly a well-thought-out implementation of many things: simple UI, page generation, privilege control, basic flow control, etc. This technology would be much more valuable if you applied it to a new sector with a tougher entry barrier in terms of business and technology. I believe there are some, such as the financial services area.

Looking forward to hearing from you.

Cheers
-Wanchun



