Data Mining and SCADA

Want to find some useful patterns in your SCADA data?

I’ve found WEKA to be a great tool in this regard.  Among other things, it is capable of classifying, clustering, visualizing results, and time-series forecasting (with a free add-in extension).  

As an example problem, a project for a client required fairly accurate (1-hour) resolution predictions of when influent flows into a treatment plant would be over 300 MGD, based solely on current and past rainfall data.

Using WEKA, I’ve been able to take hourly rain-gauge data, and predict what future hours will be over that flow threshold, assuming that the forecasted future rainfall is semi-accurate.

If this piques your interest, I’d suggest the following starting point from IBM –

If you’re interested in time-series forecasting, check out this guide from Pentaho –

Wonderware Terminal Server Memory Optimization

If you’re looking to speed up deployed InTouch applications, one of the best ways is to turn on window caching.  At a client site, we chose to turn on high-priority caching with 50% RAM available, and turned on visited window caching with 40% memory availability.  High priority windows included overview screens with many objects, as well as the trending and alarming windows.  The client has six Terminal Servers running as virtual machines, each hosting between 15-30 thinclients.  More RAM was also allocated to each of these (Wonderware recommends a minimum of 1GB of ram per ‘view.exe’ session expected to run on a machine).  To verify that the upgrade was appropriate, we needed to determine if the increased RAM was enough to handle the increased memory load from window caching.

One simple metric to flag if a Windows machine has enough memory is page reads/sec.  Regardless of all other information, if a machine reports over 5 page reads/sec for an extended period of time, it likely has a memory shortage  (This and a number of other useful counters are described by Mark Edmead and Paul Hinsberg at Microsoft Technet, including items useful for diagnosing if a server is experiencing a network issue, or determining if the processor is overloaded.  Full article here –

To save time, we didn’t want to have to remotely connect to each machine and monitor the page reads/second separately; it would be preferable to view all of resource-use related information from a single report.

This is where Windows Performance Monitor becomes very useful.  In here, you can set up a data collector set to see how resources are being handled on a number of machines at the same time.  First, run “perfmon”, and set up a new User-Defined “Data Collector Set”.  Choose “Create Manually”, and select “Performance Counter” under “Create Data Logs”.  Pick where you’d like the log data to be saved, and then choose a user which will log in to the machines to gather these performance metrics.  This needs to be a network/domain account that has access to all servers which you will be monitoring.  Then, select “Open properties for this data collector set”, and hit “Finish”.  Ensure that the network account/user that you previously selected shows up under the “General” tab (in the “Run As” box at the lower left), and hit “OK”.

Now, right click on the data collector set and create a new data collector, of type “Performance counter data collector”.  Click “add”, and browse to the first terminal server that you want to monitor.   Select “Memory/Page reads/sec”, and click “Add > >”.   Do this for each machine that you wish to monitor, and click “OK”.  Set up a sample interval; I typically use 5 seconds if I’m running the report for a few minutes.  If its running for a few days/weeks, you’d likely want to increase the interval time to reduce disk space used by the report.

Right click on the data collector set that contains the data collector you just created, and click “Start”.  Let this run for as long as you’d like to monitor the machines of interest.  Then, right click the collector set and hit “Stop”.  Under “Reports”, you should be able to drill down to a report that graphs out the page reads/sec over the time that the collector set was running.

A number of memory specific metrics and their interpretation can be found here –

To Unicode!

The following Python snippet may be useful if you need to format data exports from multiple different pieces of software before programatically analyzing them:

def to_unicode_or_die(obj, encoding=’utf-8′):
if isinstance(obj, basestring):
if not isinstance(obj,unicode):
obj = unicode(obj,encoding, errors=’ignore’)
return obj

Running FactoryTalk Apps in XP Mode

Just a quick note:

If you find that you need to run an app in XP Mode but run into FactoryTalk Directory issues, try disabling “Integration Features”.

VBA – Split a string on the first occurrence of a number

I received an IO list with tags that needed a “-” put in them to separate the letters from the characters. Like any respectable engineer, I googled for a quick solution. Double checking to make sure I wasn’t on Bing, the results that came back were as effective as a liberal arts degree. They either didn’t address the problem or resolved it with more lines than even Gary Busey could handle.

The code below looks for the first instance of a number in the string, and then separates that string into two new ones. Then, it combines the two strings and places a “-” inbetween them.

Before running the code: HIC00605
After running the code: HIC-00605

Sub SplitOnFirstNumber()

Dim CombinedString As String
Dim LastRow As Long
Dim str As String
Dim i As Integer

str = ActiveCell.Value

For i = 1 To Len(str)

If IsNumeric(Mid(str, i, 1)) Then

LeftString = Left(str, i – 1)
RightString = Right(str, Len(str) – i + 1)

Exit For
End If
Next i

CombinedString = LeftString & “-” & RightString

End Sub


The variable CombinedString now has your finished result.

That’s it! Even I was surprised when I didn’t use the split function to solve the split problem!

FactoryTalk View “Personalized Information” Error

Recently ran into this issue after updating my FactoryTalk Activation.  The solution was as simple as deleting a few files.

Per Arun Koottz at PLCTalk:

just go to C:\ProgramData\Rockwell Automation\FactoryTalk Activation… then remove the .rnl files from there…
then restat ftv/ftc…

Works well again!

Wonderware Historical SQL Queries (now with CTE!)

Just an example of how to use a CTE to pull data from the Wonderware historical SQL data directly.  This is on SP 2012.

In this case, we also LEFT JOIN the table containing the description of the tags that we are pulling to aid in quickly understanding which tags are which.

The CTE is used to allow us to perform the left join and also use a “LIKE” in the following WHERE clause.  As a reminder, “LIKE” wildcards are:

_ = Single character wildcard

% = any number of characters wildcard

DECLARE @StartDate DateTime
DECLARE @EndDate DateTime
SET @StartDate = ‘2013-01-29 06:00:00’
SET @EndDate = GetDate()
WITH cteJoinDescription AS (
SELECT History.TagName as hTagName ,DateTime = convert(nvarchar, History.DateTime, 21) ,History.Value as Value, Tag.Description
FROM [Runtime].[dbo].[History]
LEFT JOIN [Runtime].[dbo].[Tag]
ON History.TagName = Tag.TagName
WHERE wwRetrievalMode = ‘Average’
AND wwResolution = 300000
AND wwVersion = ‘Latest’
AND DateTime >= @StartDate
AND DateTime <= @EndDate )

SELECT hTagName ,DateTime ,Value, Description
FROM cteJoinDescription
WHERE hTagName LIKE (‘AFewSimilarTags______To%Get’) OR
hTagName LIKE (‘WonderwareSampleTag.ind’) OR
hTagName IN (‘OneTagWeWantToPoll’) OR
hTagName IN (‘thisothertag_thatwewanttosee’)

Alarm Client Scripted Querying

When using the DistributedAlarmClient in an Archestra Graphic Symbol, you might want to filter/query within a script.

An easy way to do this is with the following (for Historical mode, or clientmode >= 3):

If DistributedAlarmQuery.ClientMode >= 3 Then
DistributedAlarmQuery.QueryFilters.DefaultFilter.FilterCriteria = “(Group Like ‘%” + MyAreaNameHere+ “%’)”;
DistributedAlarmQuery.Favorite = “Default”;

Kept trying to figure out a way to programatically make a new query filter ‘favorite’; didn’t realize that I could just as easily call the existing “Default” favorite which can be modified using the “QueryFilters.DefaultFilter” attributes.

Factorytalk View Graphic Text/String Find and Replace

A neat note from the Knowledgebase regarding the ability to replace Strings inside graphic displays.  This is useful if you change a tag/object name inside a PLC, and need to  change the OIT instance referencing that object.

From Answer ID 29942:

“This tech note discusses how you can perform search and replace operations on STRINGS that are contained within graphic displays. When we refer to the term STRINGS in this document, we are referring to tags, expressions, commands or text assigned to objects. Text can be captions, tool tips or label text.

A quick overview of these steps is described as follows:
Step 1: Export the graphic displays to XML files.
Step 2: Search through these exported XML files looking for STRINGS and save those STRINGS found to a second set of XML files. As well, a cross-reference listing of all the STRINGS found in all the XML files will be saved in a single text file named ‘$All Strings Found.txt’.
Step 3: Perform search and replace of STRINGS within this second set of XML files and save those changes back into the second set of XML files.
Step 4: Import this second set of XML files back into the original graphic displays. This will cause the STRINGS that were replaced in the XML files to also be replaced in the graphic displays.

The steps above are implemented using the graphics ‘Import and Export’ wizard found in FactoryTalk View Studio and the ‘FTView Graphics Strings Search and Replace’ program that accompanies this technote. The attached zip file contains the program and a help file.”

HT to Doug Brock for listing this as well as the Project Documenter.

Multiagent Process Control

In a brief discussion a few weeks ago with Rockwell Automation after a presentation at EnergyTech 2012,  I became more interested in the possibility of using agents in slow processes (water/wastewater), and was directed to a few good papers, one being this:

While that paper is focused on optimizing an economic goal, the possibility of extending this to other objectives (like a minimum necessary flow rate, or tank fill level) could be quite powerful if the right architecture (multiple physical flow paths/PLCs controlling fairly interchangeable plant functions) are in place.  For starters, using agents for multiple influent settling tanks to dynamically control the influent gates to each tank, based on what the process parameters at the other tanks (and the total influent) is at a given point in time.

If you’re interested in looking further at the RA approach, elsewhere, Rockwell calls their multiagent solution the “Autonomous Cooperative System”.