On 9/10/2014 7:13 PM, Erik ?sterlund wrote:
> Hi David,
>
> On 09 Oct 2014, at 06:40, David Holmes <david.holmes at oracle.com> wrote:
>
>> Hi Erik,
>>
>> Sorry for the further delay on this.
>
> No problem.
>
>> Not sure I like VM_HAS_SPECIALIZED_BYTE_CMPXCHG. Would like to hear opinions from others. (I also can't confirm the correctness of the assembler code nor the changes to the .il files, so other's need to weigh in here regardless.)
>
> Okay. I did this in order not to break the code for maintainers of ports outside of our repo. Did not want a performance optimization for x86 to be intrusive and break the code of all others. I'm open to suggestions if anyone else has a better solution to this (that does not need virtual calls). And if you want me to ruthlessly just define a jbyte Atomic::cmpxchg inline member function that forwards to the old solution for the platforms in our repo (and breaking the code for all others outside), I can do that too.
>
>> With this approach does every arch need an entry in stubGenerator_<arch>.ccp of the form:
>>
>> StubRoutines::_atomic_cmpxchg_byte_entry = <something>;
>>
>> ? If we have to touch the code for each arch then maybe simply duplicating what is presently the shared implementation is a better way to go?
>
> This is currently used only by very awkward configurations that have no compiler support to generate the machine code I want. Instead it is generated at runtime while bootstrapping the jbyte Atomic::cmpxchg member function.
Some platforms need to dynamically adjust the runtime code based on the
actual version of the platform in use.
> Other platforms need not care about defining such entry as it is not being used by them. Unless they actually want to use it of course. :)
So I'm not sure if the answer to my question is yes, no, or maybe :)
>> In src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp you have:
>>
>> 126 #define VM_HAS_SPECIALIZED_BYTE_CMPXCHG
>>
>> and
>>
>> 220 #define VM_HAS_SPECIALIZED_BYTE_CMPXCHG
>
> I put the #define next to the Atomic::cmpxchg definitions, and Windows has two such definitions because of different possible configurations, that's why. One of the configurations generates a stub at runtime, the other one has compiler support. In case a configuration for Windows was added with no support for jbyte Atomic::cmpxchg, we would otherwise run into a bit of trouble. It won't be #defined twice for the same reason the member function is not defined twice.
I looked for, but missed the #ifdef AMD64. Must admit I don't understand
why the 32-bit and 64-bit windows code is different for these atomic
functions.
David
> Thank you for having a look at this David! :)
>
> /Erik
>
>>
>> Thanks,
>> David
>> -----
>>
>> On 2/10/2014 5:41 AM, David Holmes wrote:
>>> Hi Erik,
>>>
>>> Currently at JavaOne and then travelling back to Australia tomorrow. It
>>> will be Tuesday (Holiday Monday) before I can take another look at this.
>>>
>>> Sorry.
>>>
>>> David
>>>
>>> On 2/10/2014 4:32 AM, Erik ?sterlund wrote:
>>>> Hi,
>>>>
>>>> There is now a new webrev with my jbyte cmpxchg changes here:
>>>> http://cr.openjdk.java.net/~jwilhelm/8058255/webrev.01/
>>>>
>>>> The change from last time is:
>>>> 1. Fixed an issue where cmpxchgl was incorrectly emitted instead of
>>>> cmpxchgb for one of the configurations as pointed out by Andrew Haley,
>>>> cheers for finding that
>>>> 2. Changed the "design" around a bit to make it easy to override the
>>>> cmpxchgb for different platforms without having #if
>>>> defined(PLATFORM_X) || defined(PLATFORMY) etc. in the headers.. If not
>>>> explicitly overridden, the old general solution be used instead like
>>>> normal and this should hence not break any of the different ports to
>>>> other architectures not included in the main hotspot project.
>>>>
>>>> Jesper ran it through jprt and it was all green. Thanks for checking
>>>> that!
>>>>
>>>> Would be nice if I could get some reviews. :)
>>>>
>>>> Thanks,
>>>> Erik
>>>>
>>>>
>>>> On 20 Sep 2014, at 23:53, Erik ?sterlund
>>>> <erik.osterlund at lnu.se<mailto:erik.osterlund at lnu.se>> wrote:
>>>>
>>>> I don't want to break the ported builds so I made a special variant of
>>>> the third option where the x86 inline methods also have a #define next
>>>> to them.
>>>> The atomic.inline.hpp file then simply defines an inline
>>>> Atomic::cmpxchg calling the general solution if there is no #define
>>>> for an override.
>>>>
>>>> That way it's using variant three, but there is no need to write
>>>> overrides for every platform port (which are sometimes in other repos)
>>>> there is to simply run the default member function.
>>>> We can add them one by one instead. :)
>>>>
>>>> Hope this seems satisfactory.
>>>>
>>>> Thanks,
>>>> Erik
>>>>
>>>> On 17 Sep 2014, at 14:28, David Holmes
>>>> <david.holmes at oracle.com<mailto:david.holmes at oracle.com>> wrote:
>>>>
>>>> On 17/09/2014 9:13 PM, Erik ?sterlund wrote:
>>>> I am back! Did you guys have time to do some thinking? I see three
>>>> different solutions:
>>>>
>>>> 1. Good old class inheritance! Class Atomic is-a Atomic_YourArchHere
>>>> is-a AtomicAbstract
>>>> Using the CRTP (Curiously Recurring Template Pattern) for C++, this
>>>> could be done without a virtual call where we want inlining.
>>>>
>>>> I would prefer this approach (here and elsewhere) but it is not a
>>>> short-term option.
>>>>
>>>> 2. Similar except with the SFINAE idiom (Substitution Failure Is Not
>>>> An Error) for C++, to pick the right overload based on statically
>>>> determined constraints.
>>>> E.g. define if Atomic::has_general_byte_CAS and based on whether this
>>>> is defined or not, pick the general or specific overload variant of
>>>> the CAS member function.
>>>>
>>>> Not sure what this one is but it sounds like a manual virtual dispatch
>>>> - which seems not a good solution.
>>>>
>>>> 3. Simply make the current CAS a normal function which is called from
>>>> billions of new inline method definitions that we have to create for
>>>> every single architecture.
>>>>
>>>> I think the simple version of 3 is just move cmpxchg(jbtye) out of the
>>>> shared code and define for each platform - there aren't that many and
>>>> it is consistent with many of the other variants.
>>>>
>>>> What do we prefer here? Does anyone else have a better idea? Also,
>>>> should I start a new thread or is it okay to post it here?
>>>>
>>>> Continuing this thread is fine by me.
>>>>
>>>> I think short-term the simple version of 3 is preferable.
>>>>
>>>> Thanks,
>>>> David
>>>>
>