rbUpdate small problem

My educated guess is that FireDAC has a problem with Firebird and UTF8; more precisely, with the combination FireDAC + FB + UTF8 when setting params whose DataType has been overridden. The debugger shows the problem (see first post).

Hi Dany,

Sorry for the delay. Juggling.

The FireDAC docs firmly state that .AsString supports Unicode and that there should be no need to use ‘widestring’ variations on the params or the field accessors. I checked https://blog.marcocantu.com/blog/embarcadero_buys_anydac.html to verify the original author – Dmitry Arefiev. I would bet that Dmitry Arefiev tested plenty of Russian-text examples. Personally I have found that if there are going to be UnicodeString bugs, using Russian sample text is the quickest way to find them.
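To illustrate why Russian text flushes out such bugs quickly, here is a minimal sketch in Python (not Delphi, and not FireDAC). It simulates one common failure, a lossy pass through an ANSI code page somewhere between the application and the database. The sample strings and the function name are my own assumptions, not data from the recipe sample.

```python
def survives_ansi_round_trip(text: str, codepage: str = "cp1252") -> bool:
    """Return True if text is unchanged after a lossy pass through an ANSI code page."""
    # Characters missing from the code page are replaced with '?', which is
    # roughly what an accidental UnicodeString -> AnsiString conversion does.
    damaged = text.encode(codepage, errors="replace").decode(codepage)
    return damaged == text


# Hypothetical sample strings, for illustration only.
samples = {
    "English": "Garlic soup with ham",
    "Spanish": "Sopa de ajo con jamón",
    "Russian": "Борщ со сметаной",
}

for language, text in samples.items():
    print(language, survives_ansi_round_trip(text))
```

The English and Spanish strings survive because every character exists in cp1252, so that kind of test data can hide the bug. The Russian string does not survive, which is why it makes a good canary.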

I think something else is wrong, e.g. config of the connection or possibly the field definitions.

My experience, using both IBObjects and FireDAC with Firebird, was always that UTF8 support with IBObjects was effortless in contrast to requiring extreme care with FireDAC. I will have to go back to my notes to figure out what those details were.

My thought was that we could all talk about the Recipe content because it is public, and everyone should be able to see the same data. I did a lot of verification of the translations within FireDAC+SQLite. Today @Lajos verified the translations in the *.fdmem files, which are TFDMemTable streamed to binary - these are my “master files” for the recipe sample. All the Latin-script translations, including Hungarian and Spanish text, looked clean as viewed in a VCL grid control. The Russian data however was garbage and I don’t know yet whether that was a font issue or a grid issue or what.

Once I am certain that the .fdmem files are clean, I can review the process of getting that data into Firebird 2.5 and 3.0, and THEN we can reproduce the problem with a much better-known dataset. That will not make your life easier, admittedly. It will, however, make it possible for us to chase the problem all the way to the roots.

After we duplicate the problem with recipe, or not, then (to me) it would make sense to comment on your Document database.

I do apologize for the glacially slow pace at the moment. There is a lot going on in the world, in my family life, etc. Just understand that the delay does not imply lack of appreciation for your observations & insights.

NB: please do not download railaccident.fdb or recipe.fdb directly, even though that is possible at the moment. The RailAccident sample only has English content. I’m moving everything toward a standard where the *.fdmem files are downloaded and then moved into the target DBMS, precisely to eliminate Unicode loss.

More news within 48 hours, most likely.

Ann

That sounds like a good strategy. Libraries like Rubicon can gain a lot from TDD or similar strategies.

All good!

And what does TDD stand for?

Test Driven Development.
Usually unit tests (a pattern I am not very fond of).

:slight_smile: I call that DUnitX. A necessary pattern, but insufficient for full Rubicon testing.
I have been trying to increase the amount of consistent testing across bridges and compilers. When you svn checkout the /trunk, you get to see all that, for better or for worse.

In other news, Lajos verified that our .fdmem files for the recipe sample are fine, even the Russian content. (It was a font issue confusing the story yesterday.)

I will get those cleanly into Firebird 2.5 which I have very handy, and then we can double check with FB 3.

More news when I have it.

Status, using latest-code:
I see Russian translations of the recipe categories and text that looks valid when inspecting FB_2_5 with Jason’s IB_SQL utility.

I did several experiments with FireDAC batchmove today. To get some speed during testing, I added a feature for importing just a subset of the data.

Per the release notes for the upcoming Rubicon v4.072, the RbcResetSampleData v3.103 utility (download from here) takes a command-line parameter which lets us filter the recipe content. Instead of importing more than 100,000 records of RecipeText and more than a million Ingredients, you could, for example, just import the first 100, all of which have been translated. For many situations, that will be enough for testing.

Example BAT syntax to start it with a filter:

start RbcResetSampleData.exe /Filter="RecipeBaseNo <= 100"

If you need more data, you can filter <= 1000 or whatever…

Summary of what’s different in v3.103 of the utility:

  • command line can set the Filter, only applicable to Recipe sample so far.
  • CommitCount is 1000 instead of 100, although that did not make much of a difference in speed testing.
  • Target TDataSet remains TFDQuery because testing of TFDTable revealed that it was orders of magnitude slower once I imported more than a few thousand records into any given table.

Almost-full source to the utility is available via svn. I already committed it through to the rubicon repo available to customers. You can at least read through the FireDAC TFDBatchMove syntax there if you wish. Let me know if you try to compile it and have any trouble.

@lajos will verify with FB_3_0 and then we can get back to the very necessary work of incrementally changing the WORDS index.

PS. Yes, it feels somewhat strange to keep coding and writing as if the world powers are not fighting. As I have zero direct influence, I attempt to keep paddling my small part of the boat forward.

Hi Dany
I tested RecipeText with FB_3_0 yesterday, with the 100 record limit indicated above by Ann.
Since the RT_Description field is a BLOB SUB_TYPE 1 (text), the same as the search_content_text and search_meta_text fields in your Documents table, do you mind if we adjust your contributed MVP to go forward with RecipeText instead of your Documents table?
Regards: Lajos

I cannot see why I would have a problem with that!

Just do not publish complete texts where a context can be seen if you want to copy some strings from the MVP DB.

Thanks Dany. Next step: see whether we duplicate the problem just using RecipeText for the first 100 recipes which have been translated into the full range of lingvos. That subset is quick for repeat testing. More news when we have it.

Your patience is greatly appreciated.

Status (pls see question at the end)

  • @lajos made a repo for support ticket 00442 with your project source. FL3_MakeFTSIndexVCL.dpr is there, in /trunk/support00442. You have access.

  • If we all svn checkout that repo URL under r_suite_inactive_samples then the search paths and other details should work fine for everyone regardless of which stage of rubicon svn repo they are using.

  • This is what one wants to end up with:
    r_suite_inactive_samples\support00442\FL3_MakeFTSIndexVCL.dpr

This is essentially what svn info shows me:

Working Copy Root Path: D:\rubicon\r_suite_inactive_samples\support00442
URL: https://svn.riouxsvn.com/(snip)/trunk/support00442

Yes, this is an unconventional way to nest svn trees.
Yes, it will work, especially for what will hopefully be a relatively temporary support question.

I see that the SQL is in the DFM of the form. That’s okay. We can adjust it to suit RecipeText. There’s no generator on the RecipeText table in the FirebirdSQL version of the database at the moment. We’ll have to select max(RecipeTextNo) to get the MaxIndex. I guess we’ll have to add a table settings_general, admittedly far more multi-user realistic than using an INI file to remember the current maximum.
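A rough sketch of that idea, written in Python against an in-memory SQLite database purely so it is runnable anywhere; it is not the Firebird or FireDAC code for the MVP. The column names of settings_general (name, int_value), the 'MaxIndex' key, and the function name are assumptions made for illustration.

```python
import sqlite3


def remember_max_index(conn: sqlite3.Connection) -> int:
    """Read the current maximum primary key and store it in settings_general."""
    row = conn.execute("SELECT MAX(RecipeTextNo) FROM RecipeText").fetchone()
    # MAX() over an empty table yields NULL (None in Python), so fall back to 0.
    max_index = row[0] if row[0] is not None else 0
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings_general ("
        "name TEXT PRIMARY KEY, int_value INTEGER)"
    )
    # INSERT OR REPLACE is SQLite syntax; another DBMS needs its own upsert form.
    conn.execute(
        "INSERT OR REPLACE INTO settings_general (name, int_value) "
        "VALUES ('MaxIndex', ?)",
        (max_index,),
    )
    conn.commit()
    return max_index


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RecipeText (RecipeTextNo INTEGER PRIMARY KEY, RT_Description TEXT)"
)
conn.executemany(
    "INSERT INTO RecipeText VALUES (?, ?)", [(1, "a"), (2, "b"), (7, "c")]
)
stored = remember_max_index(conn)
print(stored)
```

Keeping the value in a table rather than an INI file means every client reads the same maximum, which is the multi-user point made above.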

Question: What does (doc.extracted = 1) imply within your document system? Does extracted mean has-been-indexed-by-Rubicon and/or something else?

Ann

Just to make things clear, I have submitted two “bug reports” and some suggestions for features.
I do not think I have any open support ticket.

Extracted indicates that the indexable text has been extracted from a source (usually a document). As records are added they go through more than one processing step. So my rbBatchAdd adds the locations that have been changed from 0 to 1 since the last time rbBatchAdd was run.

The MVP is extremely stripped of functionality :slight_smile: It is old already; I wrote it when I was under the impression that rbAppend would be usable.

Status…

I adjusted the MVP so that it works with the Recipe sample data. There’s a main menu, to be used once at the beginning, to reset the Firebird database. It will create the generator (aka sequence), and the settings table. It makes a separate table, RecipeTextExtra, for the Extracted field, just to make a clear line between the usual sample data and the extra fields for this project.

This is progress but not yet enough to test the matters of importance.

Understood and appreciated.

Best,
Ann

Hello. I am “back” now. I had to visit family during April. I survived the travel and myriad adventures, remained covid-negative; am now back to work.

Ok, I see that now in the CommandText property of the TFDCommand which executes under the “Before Append” button. I am simulating your situation using

update RecipeTextExtra r
set r.extracted = 1
where r.RecipeTextNo > 173512

The 173512 number is arbitrary.

select count(recipeTextNo) from RecipeTextExtra where Extracted = 1

That leaves us with 234 records to process.

I remain perplexed about the purpose of the local FMax variable. I see

procedure TForm2.GetMinMax;

which first assigns FMax based on the generator, so ok, that represents the max existing primary key. However, it then overrides that with the maximum extracted location, which in my simulated data is null because initially nothing has been processed, indexed, or marked extracted. Quite probably my simulation (setting extracted = 0 for all records initially) is wrong? I have been pondering what you meant by locations that have been “changed from 0 to 1 since the last time” the batch add was run.

I don’t see how time factors in. Extracted is an integer of 0 or 1. What tells you when it moved from 0 to 1?

And one question about the FireDAC syntax in TForm2.GetMinMax in the MVP.

var
  oDat: TFDDatSTable;
//
  FDCommandMaxExtracted.Open;

  // Why not use FDCommandMaxExtracted.Define in the next statement? 
  // Doesn't it matter which TFDCommand gets connected on the Define line? 
  oDat := FDCommandGetCurrent.Define; 

  FDCommandMaxExtracted.Fetch(oDat);
  FMax := oDat.Rows[0].GetData(0);  // returns NULL on recipe data thus far
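For anyone following along, the NULL can be reproduced outside Delphi. The sketch below uses Python with an in-memory SQLite database, and it assumes that the command behind FDCommandMaxExtracted is essentially a max() over the rows where extracted = 1; I have not confirmed that is its exact SQL. The ten-row table and the threshold of 7 are arbitrary stand-ins for the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RecipeTextExtra ("
    "RecipeTextNo INTEGER PRIMARY KEY, Extracted INTEGER NOT NULL DEFAULT 0)"
)
# Ten rows, all starting with Extracted = 0, as in the simulated data.
conn.executemany(
    "INSERT INTO RecipeTextExtra (RecipeTextNo) VALUES (?)",
    [(n,) for n in range(1, 11)],
)


def max_extracted(conn: sqlite3.Connection):
    """Assumed shape of the max-extracted query: max key among extracted rows."""
    sql = "SELECT MAX(RecipeTextNo) FROM RecipeTextExtra WHERE Extracted = 1"
    return conn.execute(sql).fetchone()[0]


def count_extracted(conn: sqlite3.Connection) -> int:
    sql = "SELECT COUNT(RecipeTextNo) FROM RecipeTextExtra WHERE Extracted = 1"
    return conn.execute(sql).fetchone()[0]


before = max_extracted(conn)  # no row matches, so the aggregate is NULL

# Same idea as the simulation update, with an arbitrary threshold.
conn.execute("UPDATE RecipeTextExtra SET Extracted = 1 WHERE RecipeTextNo > 7")

after = max_extracted(conn)
pending = count_extracted(conn)
print(before, after, pending)
```

An aggregate such as max() over zero matching rows yields NULL in standard SQL, so a NULL FMax is expected whenever nothing is marked extracted yet; the open question is what the MVP should do in that case.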

I will send you an email off-list with the subversion checkout URL so that you can see the code in full.

Hopefully the above is easy for you to answer, even after this very long delay.

Best,
Ann