Announcing New Text and Array Functions

11 min read Original article ↗

Blog Post

JoeMcDaid's avatar

September 29th 2022 Update
These functions are now fully deployed to Excel for the Web and users of Office 365 on the Current Channel.

I’m thrilled to share with you the availability of 14 new Excel functions designed to help you more easily manipulate text and arrays in your worksheets. 

Text Manipulation Functions

When working with text, a common task to complete is “break apart” text strings using a delimiter. You can already do this with combinations of SEARCH, FIND, LEFT, RIGHT, MID, SUBSTITUTE, and SEQUENCE, but we’ve heard from many of you that these can be challenging to use.

To make it easier to extract the text from the start or end of a cell’s contents, we are releasing two functions that simply return everything before or after your selected delimiter. Welcome, TEXTBEFORE and TEXTAFTER!

We’ve also made it easy to “split” text into multiple segments using TEXTSPLIT. Each text segment is then automatically spilled into its own cell through the magic of dynamic arrays. 

• TEXTBEFORE - Returns text that’s before delimiting characters

• TEXTAFTER - Returns text that’s after delimiting characters

• TEXTSPLIT - Splits text into rows or columns using delimiters

Array Manipulation Functions

Since the release of dynamic arrays in 2019, we’ve seen a large increase in the usage of array formulas. To make it easier to build compelling spreadsheets using dynamic arrays, we are releasing a collection of 11 new array manipulation functions.  

Combining Arrays

It can be challenging to combine data, especially when their sources are flexible in size. With VSTACK and HSTACK, you can easily combine dynamic arrays, stacking your data vertically or horizontally. 

  • VSTACK - Stacks arrays vertically
  • HSTACK- Stacks arrays horizontally

Shaping Arrays

It has been challenging to change the “shape” of data in Excel, especially from arrays to lists and vice versa. If you find yourself with a two-dimensional array that you would like to convert to a simple list, use TOROW and TOCOL to convert a 2D array into a single row or column of data. 

Using the WRAPROWS and WRAPCOLS functions, do the opposite: create a 2D array of a specified width or height by “wrapping” data to the next line (just like the text in this document) once your chosen width/height limit is reached. 

  • TOROW - Returns the array as one row
  • TOCOL - Returns the array as one column
  • WRAPROWS - Wraps a row array into a 2D array
  • WRAPCOLS - Wraps a column array into a 2D array

Resizing Arrays

Arrays too large? No problem. Enter the TAKE and DROP functions! They enable you to reduce your arrays by specifying the number of rows to keep or remove from the start or end of your array.

Similarly, using CHOOSEROWS or CHOOSECOLS, you can pick specific rows or columns out of an array by their index.

EXPAND allows you to grow an array to the size of your choice—you just need to provide the new dimensions and a value to fill the extra space with. 

  • TAKE - Returns rows or columns from array start or end
  • DROP - Drops rows or columns from array start or end
  • CHOOSEROWS - Returns the specified rows from an array
  • CHOOSECOLS - Returns the specified columns from an array
  • EXPAND - Expands an array to the specified dimensions

Scenarios to try

  • Use “ “ (space) as a delimiter with TEXTBEFORE to extract the first name and TEXTAFTER to extract the last name 
  • Use TEXTSPLIT to separate the names into an array with “ “ (space) as a delimiter 

When you want to combine two ranges of data: 

  • Use VSTACK to combine two ranges of data vertically 
  • Use HSTACK to combine two ranges horizontally 

Availability 

These functions are currently available to users running Beta Channel, Version 2203 (Build 15104.20004) or later on Windows and Version 16.60 (Build 22030400) or later on Mac.

Don’t have it yet? It’s probably us, not you.

Features are released over some time to ensure things are working smoothly. We highlight features that you may not have because they’re slowly releasing to larger numbers of Insiders. Sometimes we remove elements to further improve them based on your feedback. Though this is rare, we also reserve the option to pull a feature entirely out of the product, even if you, as an Insider, have had the opportunity to try it.

Feedback

If you have any feedback or suggestions, you can submit them by clicking  Help > Feedback. You can also submit new ideas or vote for other ideas via Microsoft Feedback.

Want to know more about Excel? See What's new in Excel and subscribe to our Excel Blog to get the latest updates. Stay connected with us and other Excel fans around the world – join our Excel Community and follow us on Twitter.

Joe McDaid (@jjmcdaid)
Program Manager, Excel

Updated Sep 29, 2022

Version 11.0

JoeMcDaid's avatar

Joined

September 06, 2016

Excel Blog

Subscribe to the Excel Blog to get the latest product announcements and updates

279 Comments

  • Andrew_Duncan54's avatar

    Is there any chance that a full conventional Regex will be introduced into Word 365?

  • jignesh0yt2k's avatar

    Keith_A_Lewis 

    Agreed - They are in C / C++ but the fact remains that in my scenario, when the Regex is called from new excel function it does not work but when called from Python it works flawlessly as shown above. As I have not delved deeper into the anatomy of regex but rather focussed on their usage, it is possible that I do not know certain things. I am actually not a programmer or developer but rather explorer of Regular expressions in their usage in commercial scenario (finance and others).

  • Keith_A_Lewis's avatar

    jignesh0yt2k You seem to be under the misconception that the regular expressions are written in Python. They are written in C or C++ and get called from Python. They can be called from any language.

  • jignesh0yt2k's avatar

    JakeArmstrong 

    I think this array of arrays is a common problem shared by many others in previous posts. Python handles it aptly but excel finds it challenging except with complicated maze of nested LET and other new excel 365 native functions recently introduced, as demonstrated above. I am reluctant to go on that path with a risk of being lost without solution but rather am interested in finding simple solution may it be with excel native functions or without them using workarounds like using Python UDF with xlwings as bridge.

    Therefore, to keep it simple, speedy and understandable to me, I have moved on the hybrid path instead of an only-strictly-excel solution.

    As for Regular Expressions, understandably as these excel functions been just introduced, they have yet to mature to Python's  level (Not sure - maybe I am wrong but as demonstrated above, these work smoothly with Python as backend rather than without it in, at least in my case).

  • jignesh0yt2k's avatar

    JakeArmstrong 

    Thank you for your efforts.

    Here is the shared file link - File has been uploaded to onedrive: https://1drv.ms/t/c/49736323b50e9a8c/EZmyHnCS_IpGmcU9giOkEPcBBFwENZmQuaaRScX5H70osQ?e=WwOhr9

    After including "^" in the beginning, the result is as under (given at the end). I think there are some incompatibilities between how Python and excel recognizes the regular expressions (or may be other reasons not known to me). As stated above, my actual use case is little complex and involves multiple regular expressions spanning 2 columns (finally combined using boolean 1 or 0 by multiplication). Patterns involve both lookaheads (negative and positive) expressions in the same pattern.

    My final function in excel (using python UDF nested with excel's native functions) that returned correctly, the scheme name and NAV data with desired filtering using Regex is:
    =SORT(DROP(UNIQUE(IF((REGEXFINDM(PQ___Python_Combo___2021[Scheme Type],CONCAT(AB4))<>"")*(REGEXFINDM(PQ___Python_Combo___2021[Scheme Name],CONCAT(AB3))<>"")=1,SPLIT_TEXT(PQ___Python_Combo___2021[Scheme Name]&";"&VALUE(PQ___Python_Combo___2021[Net Asset Value]),";"),0)),1),1,1)   [Patterns used: 

    (?i)^((?=.*direct)|(?=.*growth)|(?=.*gold)|(?=.*silver))(^((?!(regular|idcw|dividend|hybrid|balanced advantage|index|nifty)).)*$) (for filtering Scheme Name Column) [Cell AB3 in the function]

    and

    (?i)^((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$ (for filtering Scheme Type Column) [Cell AB4 in the function]

    Excel's new REGEXEXTRACT function (Pattern modified and (?i) deleted (^ included) to make it compatible with excel - Example is only for Cell AB4 pattern) :

    ^((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$ [Pattern]

    Output:

    Open Ended Schemes ( Money Market )
    Open Ended Schemes ( Money Market )
    Open Ended Schemes ( Money Market )
    Open Ended Schemes ( Money Market )
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A
    #N/A

    Update: Return mode 0 also did not change the above output: =REGEXEXTRACT(PQ___Python_Combo___2021[Scheme Type],AJ6,0,1)

  • JakeArmstrong's avatar

    jignesh0yt2k, unfortunately I'm not able to download and import your data. but it seems to me that removing the "^" is changing your regex pattern. "^" is supported in the new REGEX functions, so perhaps it's worth looking into that. If you're getting #N/A! from the function, it means that it's not finding any matches. I would also recommend using [return_mode] = 0, which will only return the first result. Since you're using a structured column reference for your text argument, I assume that input has multiple values. This will cause you to run into the 'arrays of arrays' issue if you get multiple matches returned for multiple entries of text.

    Please let me know if any of this sorts out your issues!

  • jignesh0yt2k's avatar

    Hello SergeiBaklan 

    Appreciate your response but I also want to achieve case insensitivity like for example FOF, fOf, FOf, etc and any further permutations / combinations of all these words (FOF and others above) should be covered in the condition for which I think (not sure) regular expressions is the answer. Not sure about support for case insensitivity in power query - Haven't delved deeper in to that part.

    Actually what I have covered above is just an example and my actual function in excel with regular expressions span two columns in excel: 
    For the first column the pattern is:  (?i)^((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$ and for 2nd column the pattern is: (?i)^((?=.*direct)|(?=.*growth)|(?=.*gold)|(?=.*silver))(^((?!(regular|idcw|dividend|hybrid|balanced advantage|index|nifty)).)*$) (as you can see I want to lookahead positively and negatively at the same time in the second pattern) and I am combining both patterns using python UDF nested with excel IF and CONCAT functions like this:
    =IF((REGEXFINDM(PQ___Python_Combo___2021[Scheme Type],CONCAT(AB4))<>"")*(REGEXFINDM(PQ___Python_Combo___2021[Scheme Name],CONCAT(AB3))<>"")=1,PQ___Python_Combo___2021[Net Asset Value],0) Where AB3 and AB4 cells contain above patterns  and if the multiplication of combined conditions = 1, then function returns NAV else it returns 0

    Reason behind using CONCAT is my pattern text sometimes hit 255 characters limit imposed by excel.
    Hope I am clear in explaining my use case.

  • SergeiBaklan's avatar

    jignesh0yt2k , just as a comment, you may filter directly in Power Query

        ...,
        exclude = {"FoF", "Solution Oriented", "Hybrid", "Equity"},
        AddFlag = Table.AddColumn(
            CleanName,
            "Flag",
            each List.AnyTrue(
                    List.Transform( exclude, (q) => Text.Contains( [Scheme Type], q ))
                 )
        ),
        FilterByFlag = Table.SelectRows(AddFlag, each ([Flag] = false)),
        RemoveFlag = Table.RemoveColumns(FilterByFlag,{"Flag"}),
        ...

    What to exclude could be in separate table.

  • jignesh0yt2k's avatar

    JakeArmstrong 

    Here is the link to the data which is a web page containing delimited text file (";" as delimiter). I converted that to excel using power query.

    Link to text file:

    https://portal.amfiindia.com/DownloadNAVHistoryReport_Po.aspx?frmdt=21-May-2024

    I copied and pasted text to excel and converted to a table named Table25.

    Below is the text of my power query:

    let
    Source = Excel.CurrentWorkbook(){[Name="Table25"]}[Content],
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(";", QuoteStyle.None), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7", "Column1.8"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", type text}, {"Column1.4", type text}, {"Column1.5", type text}, {"Column1.6", type text}, {"Column1.7", type text}, {"Column1.8", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Scheme Code", type text}, {"Scheme Name", type text}, {"ISIN Div Payout/ISIN Growth", type text}, {"ISIN Div Reinvestment", type text}, {"Net Asset Value", type number}, {"Repurchase Price", type text}, {"Sale Price", type text}, {"Date", type date}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type1", "Scheme Type", each if [Scheme Code] is null then [Scheme Code] else if (Text.Contains([Scheme Code],"Open Ended") or Text.Contains([Scheme Code],"Close Ended")) then [Scheme Code] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Scheme Type"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Filled Down",{"Scheme Code", "Scheme Type", "Scheme Name", "ISIN Div Payout/ISIN Growth", "ISIN Div Reinvestment", "Net Asset Value", "Repurchase Price", "Sale Price", "Date"}),
    #"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Scheme Code"}),
    #"Removed Top Rows" = Table.Skip(#"Removed Columns",1),
    #"Filtered Rows" = Table.SelectRows(#"Removed Top Rows", each ([Scheme Name] <> null))
    in
    #"Filtered Rows"

    As an additional information, after data is split into columns with above PQ, I tried the new function on the column named "Scheme Type"

    Repeating my function again here:
    =REGEXEXTRACT(PQ___Python_Combo___2021[Scheme Type],AJ6,1,1) [AJ6 contains the pattern: ((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$]

    Pattern used for excel function does not seem to cover "^" meaning beginning of the string in regular expression. So I had to delete "^", else the function produced error.


    The comparative Python pattern is:
    (?i)^((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$ - where (?i) is for ignoring case sensitivity and ^ for start of the string. This pattern correctly returns data using python library re and that UDF was called via xlwings. Other details about that UDF can be found in my above responses.


    Seems, excel is yet to support lookaheads / lookbehinds - Positive and Negative - (?=,  ?!,  ?<=,  ?<!, etc.) or am I missing something? 

  • PeterBartholomew1's avatar