Virtual Types

DarScott · Post by **DarScott** » Mon Jun 03, 2013 11:39 pm

Perhaps this pondering is a year late. Even so, I mention it in hopes that it might be seen as a good thing, or that a few decisions that could go either way might be influenced.

Earlier this year I read Taming the Monolith or something like that. That described breaking LiveCode into manageable modules.

LiveCode has been described as typeless. Another view is that it has about two and a half types. Values can be arrays. Values can be strings. Values can also be numbers but they are much like strings.

I have not checked out the source yet, so I can say... It might be that there are other types of values in the implementation that are virtually strings. For example, it might be that booleans are stored as a particular type, but the folks at RunRev have hidden that so well that I cannot tell. They are virtually strings. Storing booleans as virtual strings might make execution faster. Logical operators and if-then can work faster. I really have no idea whether this is done. And this is my point. There are some values that are passed from the output of one function to the input of another and never have to be converted to strings. That can speed processing. But the user never sees it and the language need not be complicated by it.

I suggest that this notion be accommodated at the appropriate execution processing level.

Suppose it becomes important to add a module to implement functions of some domain. Typically the outputs of these functions are used as inputs to others, but in many cases, perhaps all, the intermediate values can be useful in that domain. The inputs and outputs of these functions might be defined as strings that are meaningful to people familiar with the domain. However, the conversion to string and back might be expensive. Also, most of the use might be to feed another function. It would enhance performance to have the general processing carry these values around unconverted to strings. Conversion to strings would be lazy (to the extent appropriate for the engine).

The functions of such a module would have to comply with certain axioms, and some functions would have to be supplied. For example, conversion to a real string might be required. Equality testing might be recommended. An ordering function might be allowed if it agrees with the ordering of the values after they are converted to strings.
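As a rough illustration of the idea above, here is a minimal Python sketch of a "virtual string": a value that carries a native form between functions and only builds its text form when something asks for it. The names `VirtualString`, `point_to_text` and `translate` are hypothetical, and nothing here describes how the LiveCode engine actually works.

```python
from functools import total_ordering


@total_ordering
class VirtualString:
    """A value that looks like a string but keeps a native form inside.

    Hypothetical illustration only, not the LiveCode engine's design.
    """

    def __init__(self, native, to_text):
        self._native = native    # domain-specific internal form
        self._to_text = to_text  # function: native form -> str
        self._text = None        # cached text, built lazily
        self.conversions = 0     # counts real conversions (for the demo)

    def native(self):
        """Hand the internal form straight to another domain function."""
        return self._native

    def __str__(self):
        # Convert only the first time a string is actually needed.
        if self._text is None:
            self._text = self._to_text(self._native)
            self.conversions += 1
        return self._text

    def __eq__(self, other):
        # Equality is defined on the string view.
        return str(self) == str(other)

    def __lt__(self, other):
        # Ordering agrees with the ordering of the converted strings.
        return str(self) < str(other)


def point_to_text(point):
    """Text form of a point, e.g. (1, 2) -> '1,2'."""
    return f"{point[0]},{point[1]}"


def translate(value, dx, dy):
    """A domain function: consumes and produces the native form directly."""
    x, y = value.native()
    return VirtualString((x + dx, y + dy), point_to_text)


start = VirtualString((1, 2), point_to_text)
moved = translate(translate(start, 10, 0), 0, 5)  # no string built so far
print(str(moved))  # the only conversion happens here
```

The chained `translate` calls never touch a string; the conversion counter only moves when the result is finally displayed or compared as text.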

It might be that some intermediate values do not have a ready text interpretation, and that users always apply more processing. In this case, if the design says these are the right functions, then maybe, maybe, a hex dump of data is the string interpretation. I hope this will not be the case.

In all cases, the user never sees it, except, the designer might hope, in surprisingly good performance. Or maybe the lack of disappointing performance.

Like I said at the top, maybe the interface from general execution to type modules is pretty much set in concrete. I am coming in late with this.

(This can also apply to data that are virtual arrays. For example, it might be that so many functions work with count-indexed arrays that it would be handy to have a hidden internal form. Moving data from one function to another and using it can be fast. However, the user never knows.)

mwieder · Post by **mwieder** » Mon Jun 03, 2013 11:57 pm

> Values can also be numbers but they are much like strings.

LOL.

LCMark · Post by **LCMark** » Sun Jun 16, 2013 12:40 pm

@DarScott: Apologies for taking a while to respond to this - a lot of what you are talking about is covered by what we are doing in the refactor project. I will be writing more in detail on the topic in the near future (when I'm able to make a spare afternoon) but for now...

In the refactor project we have changed the way the engine deals with types.

In the current engine a variable can hold either a string, a number or an array - with conversions occurring at the point of use depending on context (e.g. if a string is passed to a function that requires a number, an attempt is made to convert it to such at the point the function is evaluated).

In the refactor branch, the idea stays the same except that values are abstracted to a reference-counted 'MCValueRef' opaque object. Values can be booleans, integers, strings, arrays, lists etc. and stay that way until the point of conversion to another type (as required by the context that they are being used in). It is these 'values' that variables store.

One consequence of this is that 'strings' are no longer the de-facto type (they ceased being this when arrays were introduced), it just happens that many types have a conversion between them and 'string' but this doesn't need to occur until something wants to use said type as a string.
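A loose way to picture "conversion at the point of use" is the following Python sketch. The helper names (`as_number`, `as_string`, `add`, `concat`) are invented for illustration, and the coercion rules are assumptions, not the engine's actual rules.

```python
def as_number(value):
    """Coerce a value to a number only when a numeric context needs it."""
    if isinstance(value, bool):
        raise TypeError("a boolean is not a number in this sketch")
    if isinstance(value, (int, float)):
        return value  # already numeric: no conversion at all
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            return float(value)  # raises ValueError if not numeric text
    raise TypeError(f"cannot use {type(value).__name__} as a number")


def as_string(value):
    """Coerce a value to text only when a string context needs it."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def add(a, b):
    """Numeric context: operands are converted here, not when stored."""
    return as_number(a) + as_number(b)


def concat(a, b):
    """String context: operands are converted here, not when stored."""
    return as_string(a) + as_string(b)


print(add("1", 2))     # numeric context
print(concat(1, 2))    # string context
```

The stored values keep whatever type they had; only the operation decides which view it needs.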

DarScott · Post by **DarScott** » Sun Jun 16, 2013 4:36 pm

What's happening with the refactor project is exciting.

I was wondering... Are the new value types (booleans, integers, lists, etc) all virtually strings? That is, can I by LiveCode script tell the difference between x and (x&empty)? To put it another way, is this just an implementation issue, or is this a language issue?

One hope of mine, too shocking to mention before, would be to get rid of numbers. Virtually. All numbers, in whatever form underneath, are numerals. This would mean introducing a decimal point and breaking some usage of numberFormat, the latter being a high cost. It might have to come with other number improvements to justify that high cost. In that dream world, the types are simpler, namely strings and arrays of strings.

I think strings and arrays of strings are simple. However, if the performance-conscious keep getting surprises, then maybe that breaks. Even though that is two types, there is some oneness about them.

And to complicate things, the language requires an object as the argument to some functions.

mwieder · Post by **mwieder** » Sun Jun 16, 2013 6:33 pm

OK, yes, I *am* shocked by the idea of getting rid of numbers

Would the difference then disappear between 1 + 2 and "1" + "2"?
In the first case I'd expect to get 3, and in the second case I'd expect "12".

I think what I'd like in my dream language is a Ruby-like representation where everything is an "object".
There are objects represented as strings and there are objects represented as numbers.
There are objects subclassed from other objects.
But at the end of the day, you're manipulating objects.

DarScott · Post by **DarScott** » Sun Jun 16, 2013 6:56 pm

I don't think it is reasonable to assume + is concatenation. Most of the mathematical treatment of concatenation over the past centuries has used juxtaposition, circle plus or (more recently) <>.

Also, it is not concatenation in LiveCode. It is addition. And it works for both "1" and 1. So, LiveCode is close to getting rid of numbers. So close that it is annoying.

I come from an fp background, so I don't buy into the gratuitous use of objects. LiveCode has the concept of the chunk, which might not have been fully explored. The use of chunks and the use of virtual numerals are consistent.

The goal should be to make LiveCode a good language, not just another language. There are plenty of those already.

However, I know my thinking is weird.

LCMark · Post by **LCMark** » Mon Jun 17, 2013 10:29 am

Well, where we are going on the refactored branch is to essentially make all values 'objects' - they are all variants of an opaque type MCValueRef - and although the list is fixed right now (for the most part), the idea has very much been modelled on CoreFoundation where a type is defined through implementation of a few simple methods.

In terms of 'getting rid of numbers', I think I kind of see where @DarScott is coming from in a sense but not sure it is a viable path. Whilst it is true that certain types can be completely represented (faithfully) as a string (booleans, numbers - particularly if the underlying arithmetic is decimal rather than binary fp, and strings) beyond that you have to start 'encoding' things to preserve faithfulness.

For example, there have long been (going back to HyperCard) noises about how delimiters work in string-lists (Bug 10727) - however, all the arguments against the current functionality are essentially just wishing for more than can be (sometimes one can have one's cake and eat it, other times one simply cannot). It comes down to a simple point of logic - if one wants lists-represented-as-strings that can represent 0-n items, then the trailing delimiter must be optional. If the trailing delimiter is not optional, one cannot represent either 0 items or 1 item. I think, when given the choice, people would choose to be able to represent any number of items (and remember to always terminate their lists with a trailing delimiter if the list can contain empty elements), rather than not being able to represent certain numbers of elements. Of course, the underlying problem here is that lists are strings - the attempt to massage what is a structured data type into an unstructured one is the problem.
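The trailing-delimiter argument can be checked with a small Python sketch. `parse_items` and `format_items` are hypothetical helpers that follow the rule as described in the post (a single trailing delimiter is optional and ignored); they are not a claim about exact engine behaviour.

```python
def parse_items(text, delimiter=","):
    """Split a string-list, treating one trailing delimiter as optional."""
    if text == "":
        return []  # the empty string is the list of 0 items
    if text.endswith(delimiter):
        text = text[: -len(delimiter)]  # drop the optional terminator
    return text.split(delimiter)


def format_items(items, delimiter=","):
    """Write a string-list, always terminating it with the delimiter."""
    return "".join(item + delimiter for item in items)


# Without a terminator, 0 items and 1 empty item look identical.
assert ",".join([]) == ",".join([""]) == ""

# With the terminator always written, every item count round-trips,
# as long as no item contains the delimiter itself.
for items in ([], [""], ["a"], ["a", ""], ["", "", ""]):
    assert parse_items(format_items(items)) == items

print(parse_items("a"), parse_items("a,"), parse_items(","))
```

Both `"a"` and `"a,"` parse to one item, while `","` parses to one empty item, which is the trade-off the paragraph above describes.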

Now, it is true, one can encode structure very well in strings by escaping and using things such as { } to show the boundaries of the lists. If we were to do this, then things that return lists would probably look like: "{item 1,item 2}" - which means that when you want to display them you'd have to process them in some way first - which, at the moment you don't have to do. Indeed, essentially, we would be introducing a distinction between a-string-list-in-encoded-form, and a string-list-in-display-form. It seems much more sensible to me that there should be a 'list' type that is structured, can represent any number of items and have items that can contain any value (including strings containing the delimiter); then, when that list is needed as a string (say for putting into a field) it gets converted to a string *at that point*.
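To make the "structured list, converted to a string only at the point of use" idea concrete, here is a small Python sketch. `ItemList` and `to_display` are invented names, and the comma-joined display form is an assumption for illustration.

```python
class ItemList:
    """A structured list whose items may hold any value.

    Hypothetical illustration; the display form is just a comma join.
    """

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def append(self, value):
        self._items.append(value)

    def to_display(self, delimiter=","):
        # Conversion happens only here, when text is actually wanted.
        return delimiter.join(str(item) for item in self._items)

    __str__ = to_display  # lets nested lists flatten when displayed


items = ItemList(["a,b", "c"])
print(len(items))          # the structure knows there are two items
print(items.to_display())  # the display form no longer shows that
```

The structured value keeps the item boundaries even when an item contains the delimiter; the display string is convenient to show in a field but cannot be parsed back into the same two items.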

DarScott · Post by **DarScott** » Mon Jun 17, 2013 3:59 pm

@mwieder, I didn't mean to sound as argumentative as it seems when I reread my comment. I think your being shocked is an important data point in this.

@runrevmark et al, if any values can be in items of a new list, then conversion to strings can lose information. There can be some way in which it does not, but that might not be a convenient form. (A simple case is quoting delimiters.)

However, given that associative arrays are here, the language view of an underlying list can be an array. That is, arrays with keys 0...n generated in certain ways might be implemented in a special way, but can be converted to ordinary arrays if need be, and array subscripts work. Sequences (or lists) can then be special cases of those. It might be that implementing lists this way is free as far as language concepts go.

I think I saw a document from runrev which had the notation { 0 --> "a" } (or something similar) for creating a singleton array. That might be handy even in LiveCode. Then "," or "&" can be overloaded for list concatenation. Or some other method used. If there is a convenient way to make a singleton (such as above) then there need be no ambiguity when adding a list as a single item to a list. Strings can be automatically promoted.

The notion is that underlying lists don't have to emerge into the language except that perhaps some operators work with arrays efficiently as lists.

However, it seems there are cool things on the way already, and I will have to wait and see.

In general I favor cool ways to implement virtual strings and arrays with underlying objects, but am slow to cheer new types in the language itself.

mwieder · Post by **mwieder** » Mon Jun 17, 2013 5:19 pm

@Dar - please be argumentative about things. That's why we're here. I need someone to tell me when my brainstorming has gone off the deep end, and I hope to provide that sort of backstop for others. If we don't argue about the way to move forward we'll never figure out what's best.

mwieder · Post by **mwieder** » Mon Jun 17, 2013 5:33 pm

> but am slow to cheer new types in the language itself.

The new type I'd like to see implemented is a collection. As close to a class as we can get without scaring people off.

For instance, you can't implement a linked list right now as a generic type. I'd like to do something like

Code:

new collection listItem
  local nextItem
  local previousItem
  local data
end collection

new collection myLinkedList
  local head
  local tail
  local listItemArray
  function next
end collection

new listItem myData
add myData to myLinkedList
insert myData into myLinkedList before someOtherData

DarScott · Post by **DarScott** » Tue Jun 18, 2013 2:07 am

@mwieder, I assume you mean linked list in the sense that it is a list that can do some things in some order of time. Why not just do that with an array? Your commands "add" and "insert" do not depend on a list being linked underneath.

I suspect I am missing what is important here.

mwieder · Post by **mwieder** » Wed Jun 19, 2013 6:03 am

@Dar- the problem is defining a linked list in a generic sense. I have an application that needs three linked lists. I can make three completely distinct linked lists with absolutely no interaction and no commonality, or I can design a common list mechanism and have all three lists use it. The latter is what I've done, and the complication is with the data storage - it has to be distinct for each of the three different linked lists, and so I end up with a structure where each list individually does some common task like adding a link to an array, and then does some special-purpose stuff to deal with its data storage. It's ugly and messy and hard to maintain and error-prone if I'm not really careful.

mwieder · Post by **mwieder** » Wed Jun 19, 2013 5:16 pm

The other limitation about arrays is that adding functions to them is a bit of a kludge since we don't have pointers, so they can be hashes or analogs for C structs, but it's more difficult to emulate C++ classes.

DarScott · Post by **DarScott** » Wed Jun 19, 2013 5:40 pm

@mwieder, I have emulated Objective-C objects in shallow ports, but for the most part I let arrays do what they do best.

One little thing that arrays are missing is the ability to pass a sub array as a reference in user commands and functions. Perhaps we can agree that that can be fixed. (I say little, but there might be some bad side effect problems.)

I'm not really sure how a solution "wants" linked lists. One would create a handful of functions and commands for the list. Outside of those, the list is opaque, and whether linked or not is unknown. (Opaque, except for the above feature enhancement not being implemented.) The scripts outside those don't care. Except maybe for the speed of some operations. Is that what you are getting at? You want certain operations to be faster? We can make both simple lists and linked lists with arrays.

I am strongly against pointers. Maybe, maybe references in some places. Pointers are just indexes into memory. Then why not use indexes into an array? You can create arbitrary graphs. The virtual types I was talking about would naturally increase performance with no additions to language concepts. Pointers are for machine language.
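The "indexes into an array instead of pointers" suggestion can be sketched in Python, using dictionaries in the role of associative arrays. All names here (`new_list`, `append`, `insert_before`, `values`) are hypothetical, and 0 is used as the "no node" marker by assumption.

```python
def new_list():
    """Create an independent doubly linked list stored in plain dicts."""
    return {"head": 0, "tail": 0, "next_id": 1, "nodes": {}}


def append(lst, data):
    """Add data at the end; returns the new node's index."""
    node_id = lst["next_id"]
    lst["next_id"] += 1
    lst["nodes"][node_id] = {"data": data, "prev": lst["tail"], "next": 0}
    if lst["tail"]:
        lst["nodes"][lst["tail"]]["next"] = node_id
    else:
        lst["head"] = node_id  # first node in an empty list
    lst["tail"] = node_id
    return node_id


def insert_before(lst, target_id, data):
    """Insert data just before the node with index target_id."""
    target = lst["nodes"][target_id]
    node_id = lst["next_id"]
    lst["next_id"] += 1
    lst["nodes"][node_id] = {
        "data": data, "prev": target["prev"], "next": target_id}
    if target["prev"]:
        lst["nodes"][target["prev"]]["next"] = node_id
    else:
        lst["head"] = node_id  # inserting before the old head
    target["prev"] = node_id
    return node_id


def values(lst):
    """Walk the links from head to tail and collect the data."""
    out = []
    node_id = lst["head"]
    while node_id:
        node = lst["nodes"][node_id]
        out.append(node["data"])
        node_id = node["next"]
    return out


letters = new_list()
a = append(letters, "a")
c = append(letters, "c")
insert_before(letters, c, "b")
print(values(letters))
```

Each call to `new_list()` gives a separate list with its own storage, so several lists can share the same handful of functions; scripts outside those functions only ever hold node indexes.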

I come from a functional programming background, which, to me, is more natural. So, I have no compulsion to try to turn everything into objects. (I did have to learn not to turn every LiveCode function into a one line return.)

mwieder · Post by **mwieder** » Wed Jun 19, 2013 6:12 pm

> (I did have to learn not to turn every LiveCode function into a one line return.)

<g>
I don't really want to hijack this topic into a discussion of classes or arrays/hashes. Yes, I like arrays the way they are, with the addition of being able to pass references to array elements around.
